AD-A124 634  STOCHASTIC DIFFERENTIAL GAME TECHNIQUES(U)  CALIFORNIA UNIV
LOS ANGELES SCHOOL OF ENGINEERING AND APPLIED SCIENCE
B MONS  MAR 82  DASG60-80-C-0007

UNCLASSIFIED  F/G 12/1


REPORT DOCUMENTATION PAGE

1.  REPORT NUMBER: None
2.  GOVT ACCESSION NO.:
3.  RECIPIENT'S CATALOG NUMBER:
4.  TITLE (and Subtitle): STOCHASTIC DIFFERENTIAL GAME TECHNIQUES
5.  TYPE OF REPORT & PERIOD COVERED: Final, 11/29/79-11/28/81
6.  PERFORMING ORG. REPORT NUMBER: None
7.  AUTHOR(s): Barend Mons, School of Engineering & Applied Science, UCLA
8.  CONTRACT OR GRANT NUMBER(s): DASG-60-80-C-0007
9.  PERFORMING ORGANIZATION NAME AND ADDRESS: UCLA, School of Engineering
    and Applied Science, Los Angeles, California 90024
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: None
11. CONTROLLING OFFICE NAME AND ADDRESS: Advanced Technology Center,
    U.S. Army Ballistic Missile Defense Command, Huntsville, Alabama
12. REPORT DATE: March 1982
13. NUMBER OF PAGES: 155
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling
    Office): Same as No. 11
15. SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report): DISTRIBUTION STATEMENT A.
    Approved for public release; Distribution Unlimited
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if
    different from Report): None
18. SUPPLEMENTARY NOTES: None
19. KEY WORDS (Continue on reverse side if necessary and identify by
    block number):
20. ABSTRACT (Continue on reverse side if necessary and identify by
    block number): See attached




STOCHASTIC  DIFFERENTIAL 
GAME  TECHNIQUES 


by 


Barend  Mons 


March,  1982 

Submitted under contract
DASG-60-80-C-0007
Advances in Technology Development
for Exoatmospheric Intercept Systems


U.S.  Army  Ballistic  Missile 
Defense  Command 
Advanced  Technology  Center 
Huntsville,  Alabama 


School  of  Engineering  and  Applied  Science 
University  of  California 
Los  Angeles,  California 


ABSTRACT 


The development of readily computable strategies for differential
games with noise-corrupted measurements has been hampered by the
so-called closure problem of stochastic differential games. The
solutions required either an infinite dimensional dynamic system or the
determination at each time t of the error in the opponent's state
estimate.

In this dissertation, solutions to differential games with
noise-corrupted measurements have been obtained that are readily
computable.

As a consequence of the stochastic aspects of such games, the
discussion has been restricted to linear-quadratic differential games,
which are analyzed using function space techniques.

The solution to a linear-quadratic game with perfect information is
obtained without the a priori assumption of a saddle-point solution,
and it is shown that the individual minimax and maximin solutions to
such a game result in a set of strategies that satisfy the saddle-point
condition, but with necessary and sufficient conditions that are more
stringent than previously obtained.

Following recent developments, the concepts of prior and delayed
commitment strategies are introduced and the solutions obtained for a
game where one player has perfect state information and the other
player receives noise-corrupted measurements. A pursuit-evasion
example of such a game is developed, and by solving it the numerical
differences between the prior and delayed commitment solutions for this
game are obtained.

The concept of delayed commitment games is then extended to
differential games where both players have noise-corrupted state
measurements, and solutions are obtained that are readily computable,
thus laying to rest the closure problem of stochastic differential
games.



CONTENTS

FIGURES

TABLES

1  INTRODUCTION

2  GAME THEORETIC CONCEPTS AND MATHEMATICAL BACKGROUND

   2.1  Game Theoretic Concepts

   2.2  Review of Optimal Control Theory

   2.3  Differential Game Formulation

   2.4  Solution Concepts

        2.4.1  Equilibrium Solutions

        2.4.2  Minimax and Maximin Solutions

3  THE LINEAR-QUADRATIC PERFECT INFORMATION GAME

   3.1  Linear-Quadratic Game Formulation

   3.2  Minimax Solution

   3.3  Maximin Solution

   3.4  Discussion

4  INTRODUCTION TO STOCHASTIC DIFFERENTIAL GAMES AND
   DELAYED COMMITMENT STRATEGIES

   4.1  Games with Imperfect State Information

   4.2  A Tutorial Example

5  THE PERFECT/NOISY DIFFERENTIAL GAME

   5.1  Problem Formulation and Prior Commitment Solution

   5.2  Delayed Commitment Strategies

   5.3  Discussion

6  A PURSUIT-EVASION EXAMPLE

   6.1  Problem Formulation

   6.2  Delayed Commitment Solution

7  THE NOISY/NOISY DIFFERENTIAL GAME

   7.1  Delayed Commitment Solution for Player 1

   7.2  Delayed Commitment Solution for Player 2

8  SUMMARY, CONCLUSIONS, AND SUGGESTIONS FOR FUTURE WORK

REFERENCES

APPENDIX A




FIGURES


2.1  The Extensive Form of a Game

2.2  The Normal Form of a Game

4.1  Problem Classification

5.1  Relationship Between Prior Commitment and Delayed
     Commitment Payoffs

6.1  Geometry of the Pursuit-Evasion Problem at Time t

6.2  Error Variance of Player 2 in Delayed Commitment Game

6.3  Error Variance of Player 2 in Prior Commitment Game

6.4  Feedback Gains Versus Time

6.5  Feedback Gains Versus Time

6.6  Relative Payoff Versus Time


TABLES


3.1  Summary of Optimal Deterministic Strategies

5.1  Summary of the Prior Commitment Strategies

5.2  Summary of the Delayed Commitment Strategies

6.1  Constants and Parameters Used in a Numerical
     Example of a Pursuit-Evasion Game

6.2  Values of G1(t)N(t) at t = 0

6.3  Comparison of G1(t)N(T) with G1(t)[»j(t) - S(T)]
     for W2 = 100 ft^2


CHAPTER  1 


INTRODUCTION 

The theory of games may be described as the mathematical theory
of decision-making by participants, or players, in a competitive
environment. In a typical problem each player has some control over
the outcome of a particular event, or game, and the theory is
concerned with finding the optimal course of action, or strategy, taking
into account the possible actions of the opponents. Although some
game theoretic concepts can be traced over the past couple of centuries,
modern game theory dates from 1944 with the publication of the now
classical work, "Theory of Games and Economic Behavior," by von Neumann
and Morgenstern [1].

In differential games the ideas of game theory are applied to
dynamic conflict situations which can be described by differential
equations (continuous time) or difference equations (discrete time).
The dynamic system is under control of intelligent adversaries, each
seeking to optimize his own gain at the expense of that of his
opponents, using all the available information to achieve his objective,
and having no a priori knowledge of what the opponents are going to do.
Differential game theory was first defined and studied by Isaacs [2-5]
in 1954 at the Rand Corporation, and it was only upon the publication
of his book, "Differential Games" [6], in 1965 that interest in
the subject became widespread.

Fundamental to the analysis of a game is the formulation of a
mathematical model, which includes the payoff, the allowable strategies
and the available information sets upon which the players must base
their decisions. If the interest is on detail, information and fine
structure, the extensive form of a game is often used; while if the
stress is on strategies and payoffs, the strategic or normal form of
a game is usually employed.

A fundamental tenet of game theory is the Normalization
Principle of von Neumann, which says that given a game in extensive
form it can always be reduced to an equivalent game in normal form.
Although the number of possible strategies in the normal form rapidly
becomes enormous, the conceptual simplification makes it in practice
a much simpler problem for computing optimal strategies. As a
consequence most of the existing results in game theory are for games in
normal form. However, there is still a major concern whether this
approach is philosophically sound. Aumann and Maschler [7] recently
re-examined the Normalization Principle and illustrated via a simple
example some of the pitfalls in the passage from the extensive to the
normal form of a game. Their results have immediate and serious
consequences in differential games with imperfect state information. In
effect, previously obtained results of games with imperfect information
are useful and reasonable only if the players are irrevocably
committed to a strategy determined at the beginning of the game (the prior
commitment strategy). This severely limits their applicability, not
to mention that, in general, these strategies can only be realized by
infinite dimensional state estimators [8]. This paper is therefore
concerned with determining the strategies (the delayed commitment
strategies) for differential games with imperfect information where
the players are not irrevocably committed to their prior commitment
solution. The class of games is restricted to linear time-varying
differential games with noise-corrupted measurements and a quadratic
payoff function. The allowable strategies are closed-loop, based at
each time t on all the available information up to that time, and the
final time T is fixed.

Chapter 2 presents the various concepts of game theory and a
brief review of those aspects of modern optimal control theory that
are pertinent to the later chapters. The theoretical development
begins in Chapter 3, with a careful definition and analysis of a
linear-quadratic differential game with perfect information.

Chapter 4 introduces the stochastic differential game and
illustrates the prior commitment and delayed commitment strategies via
a tutorial example.

The prior commitment solution obtained by Behn and Ho [9] and
Rhodes and Luenberger [10] to a linear-quadratic differential game
where the minimizing player has perfect measurements and the
maximizing player has noise-corrupted measurements of the state is presented
in Chapter 5. The delayed commitment solution to this problem is
then obtained and the results are compared with those of the prior
commitment solution.

To illustrate the results obtained in Chapter 5 we analyze a
pursuit-evasion example in Chapter 6 that also allows a finite
dimensional solution using the prior commitment formulation. The
solutions to both formulations have been obtained and their
characteristics compared.

The delayed commitment formulation is then extended, in Chapter
7, to the case where both players have noise-corrupted measurements,
and finite dimensional solutions, which are readily computable, are
obtained for both players.




CHAPTER  2 


GAME THEORETIC CONCEPTS AND MATHEMATICAL BACKGROUND

As pointed out in the Introduction, the study of differential
games is the dynamical equivalent of the problems studied in classical
game theory. Although many of the analytical methods for differential
games are actually extensions of techniques developed in optimal
control theory, the important concepts in differential games come mainly
from general game theory.

The fundamental concepts of game theory are introduced in this
chapter with a discussion of two basic game models. A brief review of
those aspects of modern optimal control theory relevant to the sequel
is then presented, and a general mathematical representation of a
differential game formulated. The chapter is concluded with a
discussion of the solution concepts of differential games.

2.1  GAME  THEORETIC  CONCEPTS 

The success or failure of an analysis using game theory often
hinges upon the ability to adequately model a physical situation. The
way in which a game model is formulated depends upon our interests and
the type of analysis to be performed. The two basic descriptions of a
game of interest to us are:

1.  the extensive form, and

2.  the strategic or normal form.

The extensive form of a game can be illustrated by means of a
diagram known as the game tree, shown in Figure 2.1 for a simple
two-person game. In this representation of a game, the choice of the first
player amounts to selecting one of the two branches emanating from the
point P1. After player 1 has made his choice, the second player has
to choose a branch at one of the two locations marked P2. In our
simple game, after both players have selected a branch, the payoff is
given by the two numbers at the end of the branches. In order to
indicate that both players move simultaneously we enclose both of the
nodes at P2 by a curve which indicates an information set. If the
second player knew at the time he moves what the first player has
chosen, we would then draw a separate information set around each of
the nodes.

Figure 2.1. The Extensive Form of a Game (game tree; terminal payoffs,
left to right: (6, -7), (0, 0), (0, -3), (6, -4))

When engaged in a particular game, each player is faced with
the problem of how best to play the game in order to maximize or
minimize his expected payoff. A player's complete plan for playing a game
is called a strategy, of which there are several different types. A
pure strategy for player 1 is a rule for selecting a particular move
at each of his information sets. A mixed strategy for player 1 is a
probability distribution over the set of all pure strategies. A
behavioral strategy for player 1 consists of a collection of probability
distributions, one each over the set of possible choices at each of his
information sets. A game for which the sum of the payoffs at each
terminal node is equal to zero is called a zero-sum game; all other
games are nonzero-sum. In 1912, Zermelo (see [1]) demonstrated the
existence of an optimal pure strategy for two-person zero-sum games
with perfect information, that is, games in which all information sets
contain a single node. Kuhn [11] extended this result to n-person
general-sum games with perfect information. Kuhn also showed the
existence of optimal behavioral strategies for games with perfect
recall. A game has perfect recall if each player is aware, at each of
his moves, of precisely what moves he picked prior to it, but may not
know all the choices made by the other players. In 1928, von Neumann
showed the existence of optimal mixed strategies for any two-person
zero-sum game, which is the well-known Minimax or Fundamental Theorem
of Game Theory.
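The distinction between pure, mixed and behavioral strategies can be
made concrete with a short sketch. The information-set names ("P2",
"P2_left", "P2_right"), the move labels and the uniform probabilities
are hypothetical and chosen only for illustration; the code simply
enumerates pure strategies as one move per information set.

```python
from itertools import product

def pure_strategies(info_sets):
    """Return every pure strategy: one move chosen at each information set.
    info_sets maps a (hypothetical) information-set name to its list of moves."""
    names = sorted(info_sets)
    return [dict(zip(names, moves))
            for moves in product(*(info_sets[name] for name in names))]

# Player 2 cannot observe player 1's move: a single information set.
simultaneous = pure_strategies({"P2": ["left", "right"]})

# Player 2 observes player 1's move: one information set per node.
informed = pure_strategies({"P2_left": ["left", "right"],
                            "P2_right": ["left", "right"]})

# A mixed strategy is a probability distribution over the pure strategies
# (uniform here, purely for illustration).
mixed = [1.0 / len(informed)] * len(informed)

# A behavioral strategy instead holds one distribution per information set.
behavioral = {"P2_left": [0.5, 0.5], "P2_right": [0.5, 0.5]}

print(len(simultaneous), len(informed))
```

With a single information set player 2 has two pure strategies; with
two information sets of two moves each he has four, which is one way to
see how quickly the strategy set grows with available information.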

Another of the fundamental tenets of game theory is the
Normalization Principle of von Neumann, which says that given a game in
extensive form it can always be reduced to an equivalent game in
normal form involving only strategies and payoffs. The above example
of a game in extensive form reduces in its normal form to a 2 x 2
matrix game shown in Figure 2.2. In this form, the dynamic and
informational aspects of the original problem have been suppressed into the
strategy, which covers all contingencies of the players.


Figure 2.2. The Normal Form of a Game (2 x 2 payoff matrix; rows:
Player 1's choice, 1 or 2; columns: Player 2's choice)
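A minimal sketch of this reduction for the two-move example follows.
The leaf payoffs are those printed with Figure 2.1, but their
assignment to branches (read left to right, player 1's first branch
first) is an assumption, as are the function names.

```python
# Leaf payoffs printed with Figure 2.1; the left-to-right ordering
# (player 1's first branch first) is an assumption made for this sketch.
LEAVES = [(6, -7), (0, 0), (0, -3), (6, -4)]

def normal_form(leaves, moves_1=2, moves_2=2):
    """Arrange leaf payoffs as a matrix: row = player 1's choice,
    column = player 2's choice, entry = (payoff to 1, payoff to 2)."""
    return [[leaves[i * moves_2 + j] for j in range(moves_2)]
            for i in range(moves_1)]

def is_zero_sum(matrix):
    """A game is zero-sum when the payoffs at every outcome sum to zero."""
    return all(a + b == 0 for row in matrix for (a, b) in row)

game = normal_form(LEAVES)
for row in game:
    print(row)
print("zero-sum:", is_zero_sum(game))
```

Under the assumed ordering the example is not zero-sum, since for
instance the payoffs 6 and -7 do not cancel.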


When a game is constrained by a system that evolves over time
(or some other parameter) it is called a dynamic game. If the dynamic
system representation takes the form of a difference equation, the
game is known as a discrete differential or multistage game. The
designation differential game is reserved for a dynamic game where the
dynamic system representation is in the form of a differential equation.
We will have more to say about the differential game representation
in Section 2.3. At this stage it should be noted that implied in the
formulation of a game is the assumption that the players "agree" on
the structure of the model as well as what is important to both players
as expressed by the payoff or payoff function.

In this paper we will be mainly concerned with two-person
differential games with perfect, as well as with imperfect, information.
They represent an extension of optimal control theory, in that the
optimal control problem can be considered as a one-sided game, that
is, a game with only one control input driving a dynamical system
instead of two opposing controls as in two-person differential games.
In terms of the matrix game of Figure 2.2, a one-player game would
consist of simply a single row or column. The development in this
paper will be from the optimal control system point of view and we
will therefore first discuss the general optimal control problem in
the following section.

2.2  REVIEW OF OPTIMAL CONTROL THEORY

In this section we will present a brief discussion of those
aspects of modern optimal control theory that are pertinent to our
discussion of differential games.

We will first formulate a general deterministic optimal control
problem and discuss the basic methods of solution. We will then modify
this problem to a stochastic optimal control problem, after which
attention is focused on the linear-quadratic-Gaussian problem. For this
problem we discuss the Certainty Equivalence Principle or Separation
Theorem, including the notions of controllability, observability and
optimal estimation.

In the general optimal control problem one wishes to determine
the p-component control vector u(t) that minimizes the given cost
functional

$$J(t_0, x_0, u) = B(x(T), T) + \int_{t_0}^{T} F(x(t), u(t), t)\,dt \tag{2.1}$$

subject to the constraints

$$\frac{dx}{dt} = \dot{x} = f(x(t), u(t), t) ; \quad x(t_0) = x_0 \tag{2.2}$$

The n-component vector x is the state vector and Equation (2.2) is
known as the dynamic system equation. The n-vector function f, as
well as the scalar functions B and F, are assumed to be sufficiently
smooth in the sense that all the necessary partial derivatives exist.
In addition, there may be magnitude or inequality constraints on the
state and control variables, as well as restrictions on the terminal
state. The terminal time T may be variable or fixed; here it is
assumed fixed for simplicity.

The optimal control problem is then to find that control function
u(t) (if it exists) defined on the interval $[t_0, T]$ that satisfies all
the problem constraints and is optimal in the sense that it
simultaneously minimizes the cost functional. In other words, we wish to find
the allowable control function u*(·), such that for any control u(·)
belonging to the allowable control function set U, there holds for all
$t \in [t_0, T]$

$$J(t_0, x_0, u^*) \le J(t_0, x_0, u) \tag{2.3}$$
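As a rough numerical illustration of comparing candidate controls
under (2.1) through (2.3), the sketch below evaluates the cost
functional by forward-Euler integration. The scalar dynamics f = -u,
the terminal cost B = x^2/2, the running cost F = u^2/2, the initial
state x0 = 2, the interval [0, 1] and the three constant controls are
all assumptions chosen for illustration, not taken from the text.

```python
def cost(f, B, F, u, x0, t0, T, n=1000):
    """Approximate J(t0, x0, u) = B(x(T), T) + integral of F(x, u, t) dt
    with n forward-Euler steps (illustrative helper, not from the text)."""
    dt = (T - t0) / n
    x, t, running = x0, t0, 0.0
    for _ in range(n):
        ut = u(t)
        running += F(x, ut, t) * dt   # left-endpoint rule for the integral
        x += f(x, ut, t) * dt         # Euler step of dx/dt = f(x, u, t)
        t += dt
    return B(x, T) + running

# Hypothetical scalar problem: dx/dt = -u, J = x(T)^2/2 + integral of u^2/2
f = lambda x, u, t: -u
B = lambda x, T: 0.5 * x * x
F = lambda x, u, t: 0.5 * u * u

# Cost of three constant open-loop controls starting from x0 = 2 on [0, 1]
costs = {c: cost(f, B, F, lambda t, c=c: c, 2.0, 0.0, 1.0)
         for c in (0.0, 1.0, 2.0)}
best = min(costs, key=costs.get)
print(costs, best)
```

Among these three candidates the constant control u = 1 gives the
smallest cost; the comparison only ranks the candidates tried and does
not by itself establish optimality over the whole set U.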

Basically, four methods of approach are available to solve the optimal
control problem; they are:

1.  The classical calculus of variations approach, which leads
    to the Euler-Lagrange equations as the necessary conditions
    for the control to be optimal.

2.  The Maximum Principle of Pontryagin approach, which
    provides the necessary conditions for optimality. It is
    usually the most direct method for problems involving
    magnitude constraints.

3.  The dynamic programming approach, which leads to the
    Hamilton-Jacobi equations. Although the Hamilton-Jacobi
    equation cannot be easily solved in general, u(t) is
    determined as a function of x(t), or in other words, we
    find a feedback control law, which is highly desirable.

4.  The functional analysis approach. Its appeal stems
    primarily from its geometric character, and it is most useful for
    problems formulated on a fixed time interval.

In this paper we will almost exclusively use the functional
analysis approach to obtain the solution to optimal control and
differential game problems.

Frequently, it is required to obtain on-line feedback or
closed-loop control of the dynamic system, i.e., we seek a solution of the
form u(t) = u(x(t),t). However, restricting the allowable controls to
belong to the set U : u(t) = u(x(t),t) greatly complicates the
determination of a solution. In fact, of the four basic approaches listed
above, only the dynamic programming approach directly provides a
closed-loop solution. Otherwise, the dependence of the control u(t)
on x(t) can be explicitly identified only for a linear dynamic system
with a quadratic cost functional.

If the system dynamics (Equation (2.2)) are perturbed by
random disturbances, and/or if the initial conditions are random, and/or
if the only information about the state x(t) is available
through noise-corrupted measurements of the state variables, the
deterministic optimal control problem becomes a stochastic optimal
control problem. In this case, the criterion of optimality needs to
be modified to that of minimizing the expected value of the cost
functional.

Thus, by postulating that the only available information about
the state of the system can be obtained by measurements of the form

$$z(t) = h(x(t), w(t), t) \tag{2.4}$$

where the output vector z(t) is of dimension $m \le n$, the function
$h(\cdot,\cdot,\cdot)$ is sufficiently smooth in each argument and w(t) is a
random noise process, it follows that we are dealing with a stochastic
control problem. The conversion to a stochastic optimal control problem
is completed by modifying the optimality criterion to that of minimizing
the expected value of the cost functional; i.e.,

$$J(u) = E\left\{ B(x(T), T) + \int_{t_0}^{T} F(x(t), u(t), t)\,dt \right\} \tag{2.5}$$

Furthermore, it is necessary to seek a closed-loop solution, thus the
allowable controls are of the form

$$u(t) = u(Z(t), t) \tag{2.6}$$

where

$$Z(t) = \{ (z(s), s) ; s \in (t_0, t) \} \tag{2.7}$$

i.e., the control u at time t depends on the past and present
values of the measurement history Z(t).

The class of problems for which a closed-form analytical
solution to the stochastic optimal control problem has been found is the
case of a linear system, a quadratic cost functional and white
zero-mean Gaussian noise additively corrupting the measurements of the
system output. For this special case, the optimal closed-loop solution
is given by the important Certainty Equivalence Principle or Separation
Theorem.

To review the Separation Theorem, we will consider the linear
continuous time system described by the vector differential equation

$$\frac{dx}{dt} = \dot{x} = F(t)x(t) - G(t)u(t) ; \quad x(t_0) = x_0 \tag{2.8}$$

for which are available measurements of the form

$$z(t) = H(t)x(t) + w(t) \tag{2.9}$$

where x(t) is an n-dimensional state vector, u(t) is a p-dimensional
control vector, z(t) is the output vector of dimension $m \le n$ and the
matrices F(t), G(t) and H(t) have the appropriate dimensions.

The initial state $x(t_0)$ is assumed to be a Gaussian random variable
with mean $E\{x(t_0)\} = \bar{x}_0$ and $\mathrm{cov}\{x(t_0), x(t_0)\} = P_0$. The additive
noise w(t) is assumed white and Gaussian with zero mean,
$\mathrm{cov}\{w(t), w(\tau)\} = W(t)\delta(t - \tau)$, and independent of the initial
condition $x(t_0)$.

Consider also the quadratic cost functional

$$J(u) = \tfrac{1}{2} E\left\{ x^T(T)x(T) + \int_{t_0}^{T} u^T(t)u(t)\,dt \right\} \tag{2.10}$$

where the final time T is fixed and finite, and the superscript T
denotes transposition.

Let the set U of allowable control functions be

$$U : u(t) = u(Z(t), t) \tag{2.11}$$

where

$$Z(t) = \{ (z(s), s) ; s \in (t_0, t) \} , \tag{2.12}$$

then the objective is to find that $u^*(t) \in U$ such that

$$E\{J(u^*(t))\} \le E\{J(u(t))\} \tag{2.13}$$

for all $t \in [t_0, T]$.

The solution to this problem may be stated in three parts:

1.  The optimal closed-loop solution to the corresponding
    deterministic optimal control problem, i.e., for $x(t_0)$
    known exactly, $H(t) = I$ the identity matrix and $w(t) = 0$,
    may be written as

    $$u^*(t) = G^T(t)S(t)x(t) \tag{2.14}$$

    where the n x n symmetric matrix S(t) may be precomputed
    from the matrix Riccati equation

    $$\dot{S} = -S(t)F(t) - F^T(t)S(t) + S(t)G(t)G^T(t)S(t) \tag{2.15}$$

    with the terminal condition

    $$S(T) = I \tag{2.16}$$

    If, in addition, (F,G) constitutes a controllable pair,
    i.e., if

    $$\int_{t_0}^{T} \Phi(T,t)G(t)G^T(t)\Phi^T(T,t)\,dt > 0 \tag{2.17}$$

    where $\Phi(t,t_0)$ is the system state transition matrix which
    must satisfy the relation

    $$\frac{d\Phi(t,t_0)}{dt} = F(t)\Phi(t,t_0) , \quad \Phi(t_0,t_0) = I , \tag{2.18}$$

    then S(t) exists and is bounded for all $t \le T$.

2.  The optimal closed-loop solution to the stochastic optimal
    control problem is

    $$u^*(t) = G^T(t)S(t)\hat{x}(t) \tag{2.19}$$

    where

    $$\hat{x}(t) = E\{x(t) \mid Z(t)\} \tag{2.20}$$

    with Z(t) given by Equation (2.7), that is, $\hat{x}(t)$ is the
    expected value of x(t) given the measurements z(t) up to
    time t. The matrix $G^T(t)S(t)$ is the same as that of
    Equation (2.14) and is unchanged by the conversion of the
    deterministic optimal control problem to the stochastic
    optimal control problem.

3.  The best estimate $\hat{x}(t)$ of the state x(t) given the
    measurements Z(t) is given by

    $$\dot{\hat{x}} = F(t)\hat{x}(t) - G(t)u(t) + P(t)H^T(t)W^{-1}(t)\left[ z(t) - H(t)\hat{x}(t) \right] ; \quad \hat{x}(t_0) = \bar{x}_0 \tag{2.21}$$

    where the n x n symmetric matrix P(t) satisfies the matrix
    Riccati equation

    $$\dot{P} = F(t)P(t) + P(t)F^T(t) - P(t)H^T(t)W^{-1}(t)H(t)P(t) , \quad P(t_0) = P_0 \tag{2.22}$$

    If, in addition, (F,H) constitutes an observable pair,
    i.e., if

    $$\int_{t_0}^{T} \Phi^T(t,t_0)H^T(t)H(t)\Phi(t,t_0)\,dt > 0 \tag{2.23}$$

    then P(t) exists and is bounded for all $t \in [t_0, T]$.
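Because both Riccati equations can be precomputed, a small numerical
check is possible. The sketch below integrates scalar versions of
(2.15) and (2.22) with a hand-written fourth-order Runge-Kutta routine.
The constant values F = 0, G = H = W = 1, P0 = 1 and the interval
[0, 1] are assumptions, chosen so that the closed-form answers
S(t) = 1/(1 + T - t) and P(t) = P0/(1 + P0 (t - t0)) are available for
comparison.

```python
def rk4(rhs, y0, t_start, t_end, n=200):
    """Integrate dy/dt = rhs(t, y) from t_start to t_end with n RK4 steps.
    A negative step (t_end < t_start) integrates backward in time."""
    h = (t_end - t_start) / n
    t, y = t_start, y0
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

F, G, H, W = 0.0, 1.0, 1.0, 1.0   # assumed constant scalar coefficients
t0, T, P0 = 0.0, 1.0, 1.0         # assumed interval and initial variance

# Control Riccati equation (2.15), integrated backward from S(T) = 1
S_t0 = rk4(lambda t, S: -2.0 * F * S + G * G * S * S, 1.0, T, t0)

# Filter Riccati equation (2.22), integrated forward from P(t0) = P0
P_T = rk4(lambda t, P: 2.0 * F * P - (H * H / W) * P * P, P0, t0, T)

gain_t0 = G * S_t0   # scalar feedback gain G^T S at t0, as in (2.14), (2.19)
print(S_t0, P_T, gain_t0)
```

The control equation runs backward from the terminal condition while
the filter equation runs forward from the initial covariance, and
neither depends on the measurements, which is why both can be computed
before play begins.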

The two parts (1) and (2) illustrate the Certainty Equivalence
Principle, which emphasizes the fact that, for linear systems with
quadratic cost functions and subjected to additive white Gaussian
noise inputs, the optimal feedback solution treats the conditional
mean-state estimate, $\hat{x}(t)$, as the true state. The Separation Theorem
expresses the fact that this problem can be solved via two separate
problems: optimal estimation and control.

2.3 DIFFERENTIAL GAME FORMULATION

A two person differential game differs from the optimal control
problem in that another set of control variables is available for
manipulation. Each set of control variables, u_1(t) and u_2(t), can be
thought of as being under control of an intelligent player or
controller, and each player thus has control over only some of the
relevant variables that decide the outcome of the game. The players
are opponents, and if the objective of the one controlling u_1(t) is
to minimize the cost or payoff of the game, the objective of the one
controlling u_2(t) is to maximize it.

In general, the following situation arises for a two-person
zero-sum game: For i = 1, 2 player i wishes to select his p_i
component control vector u_i(t) that optimizes

    J(u_1,u_2) = \theta(x(T),T) + \int_{t_0}^{T} \Gamma(x(t),u_1(t),u_2(t),t)\,dt    (2.24)

subject to the constraints

    \frac{dx}{dt} = \dot{x} = f(x(t),u_1(t),u_2(t),t); \quad x(t_0) = x_0    (2.25)

and

    u_i \in U_i    (2.26)

As for the optimal control problem there may be inequality constraints
on the state and control variables. To ensure termination of the game,
the terminal time T is given explicitly in the above game.

The control variables u_1 and u_2 are called the strategies of
player 1 and player 2 respectively, and are restricted to certain sets
of admissible strategies U_1 and U_2, which depend, in general, on the
specific problem to be solved. Equations (2.24) through (2.26) can be
thought of as defining the rules of the game. The progress of the
game is determined by the n first-order differential equations (2.25).
Play starts at time t_0 in the state x_0 and terminates at time t = T.
The game is zero-sum because there is a single payoff and the game is
called strictly competitive. Furthermore, the game is one of perfect
information since both players know the state x(t) at any time
t \in [t_0,T]. In the case of a two-person nonzero-sum game we may
encounter a payoff function such as


    J_i(u_1,u_2) = \theta_i(x(T),T) + \int_{t_0}^{T} \Gamma_i(x(t),u_1(t),u_2(t),t)\,dt    (2.27)

for i = 1, 2.

Since the players are assumed to have several strategies available
for play, the central problem of game theory is the determination
of which one to play.

2.4  SOLUTION  CONCEPTS 

In optimal control theory, the solutions are the allowable
control functions that optimize the criterion function and there is
no doubt about the meaning of a correct solution. In game theory,
however, the presence of the opposing control introduces a dramatic
new order of complication not usually found in the one-sided optimal
control problem. When each player determines his optimal strategy,
he must also take into account his opponent's actions toward the
opposite end, the opponent's similar wariness of the other player's
actions, and so forth. The basic difficulties are thus related to
the available information sets and the rationales used by each player.
In nonzero-sum games one can be faced with a great variety of relevant
solution concepts involving coalitions, threats, enforceability of
agreements, bargaining, etc. In this paper we will explore two
solution concepts associated with nonzero-sum games, namely, Nash
equilibrium and individual minimax solutions. In two-person zero-sum
differential games, the problem of multiple solution concepts does
not arise.




2.4.1 Equilibrium Solutions

If game theory is to recommend any specific pair of strategies
for a two-person game, then each strategy must be the best possible
against the other strategy in the pair; i.e., the pair must be an
equilibrium point. Otherwise, a knowledgeable player will know what
the theory recommends for the other player, and so will want to select
a strategy that is better for him.

If we identify the players by

Player 1: minimizing player with control u_1

Player 2: maximizing player with control u_2

then a strategy pair (u_1^*,u_2^*) is in equilibrium if

    J_1(u_1^*,u_2^*) \le J_1(u_1,u_2^*) \quad \forall\, u_1 \in U_1    (2.28)

and

    J_2(u_1^*,u_2) \le J_2(u_1^*,u_2^*) \quad \forall\, u_2 \in U_2    (2.29)

In other words, the strategies are in equilibrium if no player has any
positive reason for changing his strategy assuming that the other
player is not going to change his strategy. In game theory such an
equilibrium solution is known as a Nash equilibrium solution. Thus,
if a player knows that the other player is committed to his equilibrium
strategy, then he has reason to play the strategy which will give such
an equilibrium pair and the game is stable in the sense that no player
can unilaterally improve his payoff by changing his strategy.
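
The equilibrium test of (2.28) and (2.29) can be illustrated on a finite game, where each player picks a row or column of a cost table instead of a control function. The sketch below is only such a finite analogue, not the differential game itself; the function name and the two cost tables are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def equilibrium_pairs(J1: Matrix, J2: Matrix) -> List[Tuple[int, int]]:
    """Return every (i, j) meeting the discrete analogues of (2.28)-(2.29).

    Rows are player 1's choices (he minimizes J1); columns are player 2's
    choices (he maximizes J2).
    """
    rows, cols = len(J1), len(J1[0])
    pairs = []
    for i in range(rows):
        for j in range(cols):
            # (2.28): no other row gives player 1 a lower J1 against column j.
            p1_ok = all(J1[i][j] <= J1[k][j] for k in range(rows))
            # (2.29): no other column gives player 2 a higher J2 against row i.
            p2_ok = all(J2[i][l] <= J2[i][j] for l in range(cols))
            if p1_ok and p2_ok:
                pairs.append((i, j))
    return pairs

# Zero-sum case (J1 = J2 = J). Both cost tables are hypothetical.
with_equilibrium = [[3.0, 1.0, 2.0],
                    [5.0, 4.0, 6.0]]
without_equilibrium = [[1.0, 2.0],
                       [2.0, 1.0]]

found = equilibrium_pairs(with_equilibrium, with_equilibrium)
missing = equilibrium_pairs(without_equilibrium, without_equilibrium)
```

For the first table the only pair that passes both tests is row 0, column 0; the second table has no pair that passes, which is the situation in which pure equilibrium strategies do not exist.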

For two-person zero-sum games, the Nash equilibrium solution
leads to a saddle point on the cost surface in the control space and

    J(u_1^*,u_2) \le J(u_1^*,u_2^*) \le J(u_1,u_2^*)    (2.30)

In this case equilibrium pairs are both interchangeable and equivalent,
in the sense that, if (u_1^*,u_2^*) and (u_1^\circ,u_2^\circ) are equilibrium pairs,
then so are (u_1^*,u_2^\circ) and (u_1^\circ,u_2^*) and moreover

    J(u_1^*,u_2^*) = J(u_1^\circ,u_2^\circ) = J(u_1^*,u_2^\circ) = J(u_1^\circ,u_2^*)    (2.31)

This well-known result of equivalence and interchangeability [12] for
zero-sum games with a saddle-point solution makes the question of
uniqueness of the admissible strategies irrelevant. For, if two
saddle-points exist, their values are equivalent, and the strategies
which give those saddle-points could be played interchangeably without
changing the value of the criterion.

Unfortunately, not every game has equilibrium strategy pairs.
In general, if a game has no equilibrium strategy pairs, we usually
see the players trying to outguess each other, keeping their strategies
secret. This suggests, and is indeed true, that for finite games
with complete information, equilibrium strategies do exist.

2.4.2 Minimax and Maximin Solutions

Most practical conflict situations are not games of perfect
information since ignorance of an opponent's ultimate choice of
control is generally an essential element of a conflict situation. In
that case each player must approach the design of his own control
prepared to limit the adverse cost resulting from his opponent's
ultimate choice of control. This means that the minimizing player,
player 1, must select u_1 so as to minimize the maximum possible cost,
regardless of whether the maximizing player, player 2, ultimately
selects u_2 such as to yield this cost.

Hence, from player 1's point of view, if he selects an arbitrary
control u_1, then, regardless of the choice of player 2, he is assured
of the cost being at most

    J_1(u_1,u_2^*) = \max_{u_2} J_1(u_1,u_2)    (2.32)

Since player 1 is the minimizing player, he will select u_1 such that
this choice minimizes the maximum cost, that is,

    J_1(u_1^*,u_2^*) = \min_{u_1} \max_{u_2} J_1(u_1,u_2)    (2.33)

Thus, the minimax solution is the control u_1^*. Player 1 does not care
what strategy his opponent ultimately selects; he is that much ahead
if his opponent selects any strategy other than u_2^*, since the
resulting cost would be less than, or at best, equal to J_1(u_1^*,u_2^*):

    J_1(u_1^*,u_2) \le J_1(u_1^*,u_2^*)    (2.34)

Hence, J_1(u_1^*,u_2^*) is the loss ceiling or the security level for
player 1.

From the point of view of player 2, if he selects an arbitrary
control u_2, then regardless of the control of player 1, he is assured
of the cost being at least

    J_2(u_1^*,u_2) = \min_{u_1} J_2(u_1,u_2)    (2.35)

and since he is the maximizing player, he will select u_2 such that
this choice maximizes the minimum cost; i.e.,

    J_2(u_1^*,u_2^*) = \max_{u_2} \min_{u_1} J_2(u_1,u_2)    (2.36)

Thus, the maximin solution is the control u_2^*. Player 2 also does
not care what strategy his opponent ultimately selects, since if his
opponent selects any strategy other than u_1^*, the resulting payoff to
player 2, the maximizing player, will be greater than J_2(u_1^*,u_2^*):

    J_2(u_1^*,u_2^*) \le J_2(u_1,u_2^*)    (2.37)

Hence, J_2(u_1^*,u_2^*) is the gain floor or security level for player 2.

The controls u_1^* and u_2^*, derived on the basis of no a priori
knowledge of each opponent's ultimate choice, are again stable
solutions to the game. Assume, for example, that during a differential
game, player 2 has calculated his security level by which he
determined the control set {u_1^*(t), u_2^*(t)} and subsequently found out
that player 1 uses the strategy u_1^*(t). Then, player 2 will be able
to find another strategy u_2'(t) which will give a payoff greater
than J_2(u_1^*,u_2^*). However, as soon as player 2 employs a strategy
other than u_2^*(t), there exists a strategy u_1'(t) that together with
u_2'(t) gives a payoff such that J_2(u_1',u_2') < J_2(u_1^*,u_2^*). Thus, if
player 1 decides to secretly switch to u_1'(t), player 2 has to accept
a smaller payoff than if he had stayed with u_2^*(t) in the first place.
Thus, unless player 2 has reason to believe that player 1 is
irrevocably committed to a strategy other than u_1'(t), there is no reason at
all to play a strategy other than u_2^*(t).

If a player reveals his strategy to his opponent the best he
can hope for is the loss ceiling or the gain floor depending on whether
the revealing player is player 1 or 2. For a two-person zero-sum game,
if it happens that the minimax and maximin controls of each player
coincide, the minimax and maximin
solutions have located the familiar saddle point solution and there is
no point to secrecy.
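
The loss ceiling of (2.33) and the gain floor of (2.36) can be computed directly for a finite zero-sum game. The sketch below is a finite-table analogue under the assumption that player 1 chooses rows and player 2 chooses columns; the function names and cost tables are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def loss_ceiling(J: Matrix) -> Tuple[int, float]:
    """Minimax row for player 1: min over rows of the row maximum, cf. (2.33)."""
    worst_case = [max(row) for row in J]
    value = min(worst_case)
    return worst_case.index(value), value

def gain_floor(J: Matrix) -> Tuple[int, float]:
    """Maximin column for player 2: max over columns of the column minimum, cf. (2.36)."""
    guaranteed = [min(row[j] for row in J) for j in range(len(J[0]))]
    value = max(guaranteed)
    return guaranteed.index(value), value

# Hypothetical cost tables: the first has no saddle point, the second has one.
no_saddle = [[1.0, 4.0],
             [3.0, 2.0]]
saddle = [[3.0, 1.0, 2.0],
          [5.0, 4.0, 6.0]]

ceiling_a = loss_ceiling(no_saddle)
floor_a = gain_floor(no_saddle)
ceiling_b = loss_ceiling(saddle)
floor_b = gain_floor(saddle)
```

In the first table the loss ceiling exceeds the gain floor, so each player gains from keeping his choice secret. In the second table the two security levels agree and the corresponding row and column form a saddle point.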

2.4.3  Open-Loop  Versus  Closed-Loop  Control 

The fact that a player plays a maximin or a minimax strategy
does not imply that he cannot take advantage of any non-optimal play
of his opponent. In fact, the interim action of his opponent during
the actual play of a differential game cannot be ignored, and what is
required are controls that depend explicitly on the state x(t) of the
game.

The indifference between open-loop and closed-loop control, as
in the deterministic one-sided control problem, has its counterpart in
differential games only in the determination of a priori strategies,
in which case, u_i(t) = u_i(x(t_0),t). During the actual play of the
game it is mandatory that closed-loop control is used and u_i(t) =
u_i(x(t),t). Starr [13] has shown that for nonzero-sum differential
games the open- and closed-loop equilibrium formulations give entirely


CHAPTER  3 


THE  LINEAR-QUADRATIC  PERFECT  INFORMATION  GAME 

In this chapter we develop the solution to a differential game
with perfect state information which is of fundamental importance to
the delayed commitment strategy solutions of stochastic differential
games discussed in later chapters.

The currently available control literature shows that a
closed-loop solution for a stochastic optimal control problem seems to be
available in closed-form only in the special case of a linear system,
a quadratic cost functional and white Gaussian noise additively
corrupting the system. It therefore seems unlikely that a closed-form
solution for a stochastic differential game problem will be available
unless we assume the same or more stringent restrictions for such a
game problem. Since stochastic differential games will become our
main interest, we will restrict our discussion in this chapter to a
linear system with a quadratic payoff functional. Contrary to
previously obtained results [14], [15], however, our solution will not
be conditioned by the a priori assumption of a saddle point solution.

The linear-quadratic differential game representation used in
this paper is formulated and the solution is obtained using function
space methods. Thus, the analysis is made in Hilbert space and
follows the method of approach of Porter in [16]. It is then shown
that the optimal strategies can be obtained from a matrix Riccati
equation and can be computed prior to the actual game.


3.1 LINEAR-QUADRATIC GAME FORMULATION

Consider the linear continuous-time system governed by the
vector differential equation

    \dot{x}'(t) = \frac{dx'}{dt} = F'(t)x'(t) - G_1'(t)u_1'(t) + G_2'(t)u_2'(t); \quad x'(t_0) = x_0'    (3.1)


where the n vector x'(t) is the system state; the control vectors u_1'(t)
and u_2'(t) are of dimension p and q, respectively; and the matrices
F'(t), G_1'(t) and G_2'(t) have the appropriate dimensions. Consider also
a quadratic cost (or payoff) functional

    J(u_1',u_2') = \frac{1}{2}\Big\{ x'^T(T)Q_3x'(T) + \int_{t_0}^{T} u_1'^T(t)Q_1(t)u_1'(t)\,dt - \int_{t_0}^{T} u_2'^T(t)Q_2(t)u_2'(t)\,dt \Big\}    (3.2)

where the matrices Q_1(t), Q_2(t) and Q_3 are symmetric positive definite;
and the final time T is fixed and finite.

The payoff functional can be written more efficiently by use of
the following transformations. Since Q_1(t), Q_2(t) and Q_3 are
positive definite and symmetric, they may be factored as

    Q_i = Q_i^{1/2\,T} Q_i^{1/2}, \quad i = 1, 2, 3    (3.3)


Then by the transformations

    x'(t) \to Q_3^{-1/2}x(t)
    u_1'(t) \to Q_1^{-1/2}(t)u_1(t)    (3.4)
    u_2'(t) \to Q_2^{-1/2}(t)u_2(t)

the system equation becomes

    \dot{x} = Q_3^{1/2}F'(t)Q_3^{-1/2}x(t) - Q_3^{1/2}G_1'(t)Q_1^{-1/2}(t)u_1(t) + Q_3^{1/2}G_2'(t)Q_2^{-1/2}(t)u_2(t); \quad x(t_0) = Q_3^{1/2}x_0' = x_0    (3.5)


If we now define the new matrices

    F(t) = Q_3^{1/2}F'(t)Q_3^{-1/2}
    G_1(t) = Q_3^{1/2}G_1'(t)Q_1^{-1/2}(t)    (3.6)
    G_2(t) = Q_3^{1/2}G_2'(t)Q_2^{-1/2}(t)

the system equation and payoff functional are respectively

    \dot{x} = \frac{dx}{dt} = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_0) = x_0    (3.7)

    J(u_1,u_2) = \frac{1}{2}\Big\{ x^T(T)x(T) + \int_{t_0}^{T} [ u_1^T(t)u_1(t) - u_2^T(t)u_2(t) ]\,dt \Big\}    (3.8)


In view of the possibility of making the above transformations, we will
consider Equation (3.7) as the defining system equation and Equation
(3.8) as the payoff functional.

The above formulation involves a single dynamic system instead
of the two separate systems of the pursuit-evasion problem as in [14].
However, this single system includes the pursuit-evasion problem as a
special case, since the individual state vectors of the pursuit-evasion
problem can be combined into a single state vector and the two
opponents considered to constitute a single system.

Player 1, the minimizing player, attempts to minimize the payoff
functional or criterion; i.e., he minimizes the term x^T(T)x(T) in
Equation (3.8) as well as his own expended energy, while maximizing the
energy expended by player 2. Player 2, the maximizing player, attempts
to maximize the same criterion. Thus, the game is zero-sum and since
each player is assumed to have perfect knowledge of the system state
it is more accurately a zero-sum game with perfect information.

The class of admissible strategies is defined as those U_1 and
U_2 which give rise to the controls

    U_1: u_1 = u_1(t,x(t))
    U_2: u_2 = u_2(t,x(t))    (3.9)

that are bounded and that are continuous almost everywhere for t_0 \le t \le T.
It is well known that, for arbitrary t \ge t_0, x_0, u_1(t) and u_2(t),
the solution to Equation (3.7) may be written as

    x(t) = \Phi(t,t_0)x(t_0) - \int_{t_0}^{t} \Phi(t,\tau)G_1(\tau)u_1(\tau)\,d\tau + \int_{t_0}^{t} \Phi(t,\tau)G_2(\tau)u_2(\tau)\,d\tau    (3.10)

where \Phi(t,t_0) is the state transition matrix, i.e. it satisfies the
relation

    \frac{\partial \Phi(t,t_0)}{\partial t} = F(t)\Phi(t,t_0), \quad \Phi(t_0,t_0) = I    (3.11)


As mentioned previously we will analyze the above problem
using functional analysis techniques, although any other of the four
basic methods of approach mentioned in Section 2.2 could have been
used.

To reformulate the differential game in a suitable Hilbert
space consider the controls u_1(\cdot) and u_2(\cdot) to be elements of the
Hilbert spaces H_1 = L_2^p[t_0,T] and H_2 = L_2^q[t_0,T] respectively, where
the space L_2^r[t_0,T] is the space of r-vector functions which are
defined and (Lebesgue-) square integrable over the interval [t_0,T].
The inner product on this space is defined as

    \langle x, y \rangle = \int_{t_0}^{T} x^T(t)y(t)\,dt    (3.12)

and the norm is defined in terms of the inner product as

    \| y \|^2 = \langle y, y \rangle = \int_{t_0}^{T} y^T(t)y(t)\,dt    (3.13)
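
As a numerical illustration of (3.12) and (3.13), the inner product and norm can be approximated by quadrature. The sketch below assumes a simple trapezoidal rule and uses hypothetical test functions; it is not part of the original analysis.

```python
import math
from typing import Callable, Sequence

VectorFn = Callable[[float], Sequence[float]]

def inner(x: VectorFn, y: VectorFn, t0: float, T: float, n: int = 2000) -> float:
    """Trapezoidal approximation of <x, y> = integral of x^T(t) y(t), Eq. (3.12)."""
    h = (T - t0) / n
    total = 0.0
    for k in range(n + 1):
        t = t0 + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * sum(a * b for a, b in zip(x(t), y(t)))
    return total * h

def norm(y: VectorFn, t0: float, T: float) -> float:
    """Norm induced by the inner product, Eq. (3.13)."""
    return math.sqrt(inner(y, y, t0, T))

# Hypothetical 2-vector functions on the stated intervals.
x = lambda t: (1.0, t)
y = lambda t: (t, 1.0)
unit = lambda t: (math.sin(t), math.cos(t))

xy = inner(x, y, 0.0, 1.0)                 # integral of 2t over [0, 1]
unit_norm_sq = norm(unit, 0.0, 2.0) ** 2   # integral of 1 over [0, 2]
```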

Hence the two integrals in Equation (3.10) may be considered as linear
operations on u_1 and u_2 respectively, and we can represent the dynamic
system (3.7) in terms of linear transformations on suitable Hilbert
spaces as

    x(t) = \Phi(t)x_0 - (T_1u_1)(t) + (T_2u_2)(t)    (3.14)

where the linear operator T_1: L_2^p[t_0,T] \to E^n is defined by

    (T_1u_1)(t) = \int_{t_0}^{t} \Phi(t,\tau)G_1(\tau)u_1(\tau)\,d\tau    (3.15)

with a similar definition for T_2 and E^n is the n-dimensional Euclidean
space. The terminal state can then be written as

    x(T) = \Phi(T)x_0 - (T_1u_1)(T) + (T_2u_2)(T)    (3.16)

Dropping the argument T whenever t = T, the first term of the payoff
functional (3.8) may then be written as

    x^T(T)x(T) = \langle \Phi x_0 - T_1u_1 + T_2u_2,\; \Phi x_0 - T_1u_1 + T_2u_2 \rangle    (3.17)


The other terms of the payoff functional may similarly be expressed
as inner products and we can write the payoff functional as

    J(u_1,u_2) = \frac{1}{2}\Big\{ \langle \Phi x_0 - T_1u_1 + T_2u_2,\; \Phi x_0 - T_1u_1 + T_2u_2 \rangle + \langle u_1,u_1 \rangle - \langle u_2,u_2 \rangle \Big\}    (3.18)

which now includes the dynamic system since it has been used to
develop this equation.

3.2 MINIMAX SOLUTION

For the minimax solution we have to find the u_2^*(t) that
maximizes (3.18) for arbitrary u_1(t) and then that u_1^*(t) that
minimizes this maximum cost.

Forming the functional derivative of J(u_1,u_2) with respect to
u_2 and setting this derivative equal to zero, we obtain

    \frac{\partial J(u_1,u_2)}{\partial u_2} = -u_2 + T_2^*\Phi x_0 - T_2^*T_1u_1 + T_2^*T_2u_2 = 0    (3.19)

(where the asterisk denotes the adjoint operator) or

    u_2 = T_2^*\Phi x_0 - T_2^*T_1u_1 + T_2^*T_2u_2    (3.20)

The above equation requires u_2 to be in the range of T_2^*, thus we
may write

    u_2 = T_2^*\lambda_2    (3.21)

Making this change of variable results in

    T_2^*\lambda_2 = T_2^*\Phi x_0 - T_2^*T_1u_1 + T_2^*T_2T_2^*\lambda_2    (3.22)

which will hold whenever

    \lambda_2 = \Phi x_0 - T_1u_1 + T_2T_2^*\lambda_2    (3.23)

or

    \lambda_2 = (I - T_2T_2^*)^{-1}(\Phi x_0 - T_1u_1)    (3.24)

Thus, whenever the indicated inverse exists, the candidate extremal
control u_2^* is

    u_2^* = T_2^*(I - T_2T_2^*)^{-1}(\Phi x_0 - T_1u_1)    (3.25)


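With T: L_2[t_0,T] \to E^n defined as in Equation (3.15) by

    (Tu)(t) = \int_{t_0}^{t} \Phi(t,\tau)G(\tau)u(\tau)\,d\tau    (3.26)

the inner product of Tu in E^n with an arbitrary vector \xi \in E^n is

    \langle \xi, Tu \rangle = \Big\langle \xi, \int_{t_0}^{t} \Phi(t,\tau)G(\tau)u(\tau)\,d\tau \Big\rangle
                            = \int_{t_0}^{t} [ \xi, \Phi(t,\tau)G(\tau)u(\tau) ]\,d\tau
                            = \int_{t_0}^{t} [ G^T(\tau)\Phi^T(t,\tau)\xi, u(\tau) ]\,d\tau
                            = \langle T^*\xi, u \rangle    (3.27)

Hence the adjoint operators T_1^* and T_2^* are identified by the
equations

    (T_1^*\xi_1)(t) = G_1^T(t)\Phi^T(T,t)\xi_1    (3.28)

    (T_2^*\xi_2)(t) = G_2^T(t)\Phi^T(T,t)\xi_2    (3.29)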
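
The adjoint identity of (3.27) can be checked numerically in the scalar case. The sketch below assumes constant scalar coefficients, so that \Phi(T,\tau) = exp(f (T - \tau)); the coefficient values, the control u and the vector \xi are hypothetical.

```python
import math
from typing import Callable

def trapz(f: Callable[[float], float], a: float, b: float, n: int = 2000) -> float:
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        s += f(a + k * h)
    return s * h

# Hypothetical scalar data (n = 1, one control component).
f_coef, g, t0, T = -0.5, 2.0, 0.0, 1.0
xi = 0.7

def phi(t_to: float, t_from: float) -> float:
    """Scalar state transition 'matrix' for constant f_coef."""
    return math.exp(f_coef * (t_to - t_from))

def u(t: float) -> float:
    return math.sin(3.0 * t)

# Left side: <xi, Tu> in E^1, with (Tu)(T) from Eq. (3.26).
Tu = trapz(lambda tau: phi(T, tau) * g * u(tau), t0, T)
lhs = xi * Tu

# Right side: <T* xi, u> in L2, with (T* xi)(t) from Eq. (3.28).
def Tstar_xi(t: float) -> float:
    return g * phi(T, t) * xi

rhs = trapz(lambda t: Tstar_xi(t) * u(t), t0, T)
```

Both sides agree to rounding error, which is the scalar content of (3.27).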


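Thus, Equation (3.25) can be written as

    u_2^*(t) = G_2^T(t)\Phi^T(T,t)\Big[ I - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}
               \Big[ \Phi(T,t_0)x_0 - \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)u_1(\tau)\,d\tau \Big]    (3.30)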


For u_2^*(t) to be indeed locally maximizing, \partial^2 J / \partial u_2^2 < 0 must be
satisfied. Differentiating Equation (3.19) with respect to u_2 gives

    -I + T_2^*T_2 < 0    (3.31)

thus requiring that

    I - \int_{t_0}^{T} \Phi^T(T,\tau)G_2^T(\tau)G_2(\tau)\Phi(T,\tau)\,d\tau > 0, \quad t_0 \le \tau \le T    (3.32)

In addition, no conjugate points may exist on the extremal path, which
is equivalent to requiring that (I - T_2T_2^*) > 0

or

    I - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0, \quad t_0 \le \tau \le T    (3.33)

which assures the existence of the inverse in Equation (3.30) so u_2^*(t)
exists over the entire interval [t_0,T]. If Equation (3.33) is not
satisfied; i.e., if there exists a time t_s < T for which the matrix

    I - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau    (3.34)

becomes singular, then the control u_2^*(t) is no longer maximizing for
t > t_s.

Assuming Equation (3.32) and thus Equation (3.33) to hold, the
maximizing control u_2^*(t) for arbitrary u_1(t) is given by Equation
(3.25) or

    u_2^* = T_2^*(I - T_2T_2^*)^{-1}(\Phi x_0 - T_1u_1)
          = T_2^*D_2(\Phi x_0 - T_1u_1)    (3.35)


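where

    D_2 = (I - T_2T_2^*)^{-1}    (3.36)

Substituting u_2^* into J(u_1,u_2) gives the following payoff functional

    J(u_1) = \frac{1}{2}\Big\{ \langle \Phi x_0 - T_1u_1 + T_2T_2^*D_2(\Phi x_0 - T_1u_1),\; \Phi x_0 - T_1u_1 + T_2T_2^*D_2(\Phi x_0 - T_1u_1) \rangle
             + \langle u_1,u_1 \rangle - \langle T_2^*D_2(\Phi x_0 - T_1u_1),\; T_2^*D_2(\Phi x_0 - T_1u_1) \rangle \Big\}    (3.37)

which simplifies after some work to

    J(u_1) = \frac{1}{2}\Big\{ \langle \Phi x_0 - T_1u_1,\; D_2(\Phi x_0 - T_1u_1) \rangle + \langle u_1,u_1 \rangle \Big\}    (3.38)

and u_1^* must minimize this payoff.

Forming the functional derivative of J(u_1) with respect to u_1 and
setting this derivative to zero gives

    \frac{\partial J(u_1)}{\partial u_1} = u_1 - T_1^*D_2(\Phi x_0 - T_1u_1) = 0    (3.39)

or

    u_1 = -T_1^*D_2T_1u_1 + T_1^*D_2\Phi x_0    (3.40)

As this equation requires u_1 to be in the range of T_1^*, we
write

    u_1 = T_1^*\lambda_1    (3.41)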


and after making this change of variables we obtain

    T_1^*\lambda_1 = -T_1^*D_2T_1T_1^*\lambda_1 + T_1^*D_2\Phi x_0    (3.42)

which will hold whenever

    \lambda_1 = -D_2T_1T_1^*\lambda_1 + D_2\Phi x_0    (3.43)

or

    \lambda_1 = [ I + (I - T_2T_2^*)^{-1}T_1T_1^* ]^{-1}(I - T_2T_2^*)^{-1}\Phi x_0
              = [ I - T_2T_2^* + T_1T_1^* ]^{-1}\Phi x_0    (3.44)

Thus, the minimax control for player 1 is

    u_1^* = T_1^*D\Phi x_0    (3.45)

where

    D = [ I + T_1T_1^* - T_2T_2^* ]^{-1}    (3.46)

The indicated inverse exists if (I - T_2T_2^*) > 0 as required for u_2^*(t).

Substituting into Equation (3.35) gives as the corresponding
optimal control for player 2

    u_2^* = T_2^*D_2(\Phi x_0 - T_1T_1^*D\Phi x_0)
          = T_2^*D\Phi x_0    (3.47)


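Evaluation of (3.18), using (3.45) and (3.47) yields the minimax
cost from time t_0 to completion at time T as

    J(u_1^*,u_2^*) = \frac{1}{2}\langle \Phi x_0,\; [ I + T_1T_1^* - T_2T_2^* ]^{-1}\Phi x_0 \rangle    (3.48)

With T_1, T_2 and T_1^*, T_2^* defined by equations (3.15) and (3.28),
(3.29) respectively, the minimax solution is from equations (3.45) and
(3.47)

    u_1^*(t) = G_1^T(t)\Phi^T(T,t)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
               - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x(t_0)    (3.49)

    u_2^*(t) = G_2^T(t)\Phi^T(T,t)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
               - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x(t_0)    (3.50)

The minimax cost to complete the process from the arbitrary
time t_0 is from equation (3.48)

    J(u_1^*,u_2^*) = \frac{1}{2}x^T(t_0)\Phi^T(T,t_0)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
                     - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x(t_0)    (3.51)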
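
A scalar example makes (3.45) through (3.51) concrete. The sketch below assumes constant scalar coefficients, so that \Phi(T,\tau) = exp(f (T - \tau)), and evaluates the operators by Simpson quadrature; all numerical values are hypothetical and chosen so that 1 - T_2T_2^* > 0. It applies the open-loop controls (3.49) and (3.50), computes the terminal state from (3.16) and the payoff from (3.8), and compares the result with the closed-form cost (3.48).

```python
import math
from typing import Callable

def simpson(f: Callable[[float], float], a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

# Hypothetical scalar data (n = p = q = 1).
f_coef, g1, g2 = 0.3, 1.0, 0.5
t0, T, x0 = 0.0, 1.0, 2.0

def phi(t_to: float, t_from: float) -> float:
    return math.exp(f_coef * (t_to - t_from))

M1 = simpson(lambda tau: (phi(T, tau) * g1) ** 2, t0, T)   # scalar T1 T1*
M2 = simpson(lambda tau: (phi(T, tau) * g2) ** 2, t0, T)   # scalar T2 T2*
D = 1.0 / (1.0 + M1 - M2)                                  # Eq. (3.46)
y = phi(T, t0) * x0                                        # Phi x0

def u1_star(t: float) -> float:                            # Eq. (3.49)
    return g1 * phi(T, t) * D * y

def u2_star(t: float) -> float:                            # Eq. (3.50)
    return g2 * phi(T, t) * D * y

# Terminal state from Eq. (3.16) and payoff from Eq. (3.8).
xT = (y
      - simpson(lambda s: phi(T, s) * g1 * u1_star(s), t0, T)
      + simpson(lambda s: phi(T, s) * g2 * u2_star(s), t0, T))
J = 0.5 * (xT ** 2 + simpson(lambda s: u1_star(s) ** 2 - u2_star(s) ** 2, t0, T))
J_closed = 0.5 * y * D * y                                 # Eq. (3.48)
```

In this example the terminal state equals D \Phi x_0 and the simulated payoff matches (3.48) up to rounding error.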


The necessary and sufficient condition for the existence of
the minimax solution is from Equation (3.34)

    I - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0, \quad t_0 \le \tau \le T    (3.52)


For the maximin solution it is required to find the u_1^*(t)
that minimizes (3.18) for arbitrary u_2(t) and then that u_2^*(t) that
maximizes this minimum cost.

Forming the functional derivative of J(u_1,u_2), i.e., Equation
(3.18), with respect to u_1 and setting this derivative equal to zero
we obtain

    \frac{\partial J(u_1,u_2)}{\partial u_1} = u_1 - T_1^*\Phi x_0 + T_1^*T_1u_1 - T_1^*T_2u_2 = 0    (3.53)

or

    u_1 = T_1^*\Phi x_0 - T_1^*T_1u_1 + T_1^*T_2u_2    (3.54)

This equation requires u_1 to be in the range of T_1^*, thus we may
write

    u_1 = T_1^*\lambda_1    (3.55)


which will hold whenever

    \lambda_2 = D_1T_2T_2^*\lambda_2 + D_1\Phi x_0    (3.68)

or

    \lambda_2 = [ I - (I + T_1T_1^*)^{-1}T_2T_2^* ]^{-1}(I + T_1T_1^*)^{-1}\Phi x_0
              = [ I + T_1T_1^* - T_2T_2^* ]^{-1}\Phi x_0    (3.69)

Hence the maximin control for player 2 is

    u_2^* = T_2^*[ I + T_1T_1^* - T_2T_2^* ]^{-1}\Phi x_0
          = T_2^*D\Phi x_0    (3.70)

Substituting u_2^* into Equation (3.59) gives

    u_1^* = T_1^*D_1\Phi x_0 + T_1^*D_1T_2T_2^*D\Phi x_0
          = T_1^*D\Phi x_0    (3.71)

Evaluation of (3.18), using (3.70) and (3.71) yields the
maximin cost from time t_0 to completion at time T as

    J(u_1^*,u_2^*) = \frac{1}{2}\langle \Phi x_0,\; [ I + T_1T_1^* - T_2T_2^* ]^{-1}\Phi x_0 \rangle    (3.72)

With T_1, T_2 and T_1^*, T_2^* defined by Equations (3.15) and (3.28), (3.29)
respectively, the solution is from equations (3.70) and (3.71)

    u_1^*(t) = G_1^T(t)\Phi^T(T,t)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
               - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x_0    (3.73)

    u_2^*(t) = G_2^T(t)\Phi^T(T,t)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
               - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x_0    (3.74)


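The maximin cost to complete the process from the arbitrary
time t_0 is from Equation (3.72)

    J(u_1^*,u_2^*) = \frac{1}{2}x^T(t_0)\Phi^T(T,t_0)\Big[ I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
                     - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t_0)x(t_0)    (3.75)

while the necessary and sufficient condition for the existence of the
maximin solution is from Equation (3.49)

    I + \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau - \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0, \quad t_0 \le \tau \le T    (3.76)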


3.4 DISCUSSION


Comparison of the minimax solution (Equations (3.49) through
(3.52)) with the maximin solution (Equations (3.73) through (3.76)) shows
that the solutions are identical. Hence, we have obtained the saddle
point solution to the two-person zero-sum game, i.e.,

    J(u_1^*,u_2) \le J(u_1^*,u_2^*) \le J(u_1,u_2^*)    (3.77)
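
The saddle point inequality (3.77) can be probed numerically in the scalar case by perturbing one control at a time. The sketch below assumes constant scalar coefficients, Simpson quadrature, and an arbitrary perturbation direction; all numbers are hypothetical, with g2 chosen so that 1 - M_2 > 0.

```python
import math
from typing import Callable

Control = Callable[[float], float]

def simpson(f: Control, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

# Hypothetical scalar data (n = p = q = 1).
f_coef, g1, g2 = 0.3, 1.0, 0.5
t0, T, x0 = 0.0, 1.0, 2.0

def phi(t_to: float, t_from: float) -> float:
    return math.exp(f_coef * (t_to - t_from))

def payoff(u1: Control, u2: Control) -> float:
    """Payoff (3.8) with the terminal state taken from (3.16)."""
    xT = (phi(T, t0) * x0
          - simpson(lambda s: phi(T, s) * g1 * u1(s), t0, T)
          + simpson(lambda s: phi(T, s) * g2 * u2(s), t0, T))
    return 0.5 * (xT ** 2 + simpson(lambda s: u1(s) ** 2 - u2(s) ** 2, t0, T))

M1 = simpson(lambda s: (phi(T, s) * g1) ** 2, t0, T)
M2 = simpson(lambda s: (phi(T, s) * g2) ** 2, t0, T)
D = 1.0 / (1.0 + M1 - M2)
y = phi(T, t0) * x0
u1_star = lambda t: g1 * phi(T, t) * D * y
u2_star = lambda t: g2 * phi(T, t) * D * y

J_star = payoff(u1_star, u2_star)
bump = lambda t: math.sin(5.0 * t)       # arbitrary perturbation direction
saddle_holds = []
for delta in (-0.5, 0.1, 1.0):
    u1_pert = lambda t, d=delta: u1_star(t) + d * bump(t)
    u2_pert = lambda t, d=delta: u2_star(t) + d * bump(t)
    left = payoff(u1_star, u2_pert)      # player 2 deviates: payoff drops
    right = payoff(u1_pert, u2_star)     # player 1 deviates: payoff rises
    saddle_holds.append(left <= J_star <= right)
```

A unilateral deviation by either player moves the payoff in the direction unfavorable to that player, as (3.77) requires.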

If we define the symmetric matrices M(T,t_0), M_1(T,t_0) and
M_2(T,t_0) as

    M(T,t_0) = I + M_1(T,t_0) - M_2(T,t_0)

    M_1(T,t_0) = \int_{t_0}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau    (3.78)

    M_2(T,t_0) = \int_{t_0}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau

we can write the optimal solutions as

    u_1^*(t) = G_1^T(t)\Phi^T(T,t)M^{-1}(T,t_0)\Phi(T,t_0)x_0    (3.79)

    u_2^*(t) = G_2^T(t)\Phi^T(T,t)M^{-1}(T,t_0)\Phi(T,t_0)x_0    (3.80)

and it is obvious that the optimal controls are proportional to
\Phi(T,t_0)x_0 which is the terminal miss if both controllers remain
inactive and the system is allowed to run free. The time varying
matrices reflect the control capabilities of both players.


From optimal control theory (Section 2.2), we know that the
necessary and sufficient condition for the system to be controllable
on [t_0,T] by controller 1 with u_2(\cdot) = 0 is

    M_1(T,t_0) = \int_{t_0}^{T} \Phi(T,t)G_1(t)G_1^T(t)\Phi^T(T,t)\,dt > 0    (3.81)

for all t in [t_0,T], while the necessary and sufficient condition for
the system to be controllable on [t_0,T] by controller 2 with
u_1(\cdot) = 0 is

    M_2(T,t_0) = \int_{t_0}^{T} \Phi(T,t)G_2(t)G_2^T(t)\Phi^T(T,t)\,dt > 0    (3.82)

and we can define M_1(T,t_0) and M_2(T,t_0) as the reduced controllability
matrices of player 1 and player 2 respectively.

The conditions for the existence of M^{-1}(T,t_0) obtained in the
maximin solution provide additional insight into the problem if we
consider the limiting case of weighing the importance of terminal
miss against control effort. In this case the payoff functional is
written as

    J(u_1,u_2) = \frac{1}{2}\Big\{ a^2 x^T(T)x(T) + \int_{t_0}^{T} [ u_1^T(t)u_1(t) - u_2^T(t)u_2(t) ]\,dt \Big\}    (3.83)

where the scalar a permits the required weighting, and the resulting
M(T,t_0) is then

    M(T,t_0) = \frac{1}{a^2}I + M_1(T,t_0) - M_2(T,t_0)    (3.84)

In the limiting case, i.e., a^2 \to \infty in the sense that

    \frac{a^2}{2}( x^T(T)x(T) ) = 0 if x(T) = 0, \quad \infty if x(T) \ne 0    (3.85)

the existence of M^{-1}(T,t_0) is guaranteed if

    M_1(T,t_0) - M_2(T,t_0) > 0    (3.86)

M_1(T,t_0) - M_2(T,t_0) is known as the relative controllability matrix [14]; the fact
that it is positive definite indicates that the minimizing player,
player 1, is "more controllable" than the maximizing player, player 2.

The initial time t_0 is completely arbitrary, while the
assumption of perfect information guarantees that x(t) is available for any
t. Hence the open-loop controls can be applied continuously and
instantaneously to yield optimal feedback control laws by replacing
t_0 by t.

If we define

    S(t) = \Phi^T(T,t)\Big[ I + \int_{t}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
           - \int_{t}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}\Phi(T,t)    (3.87)

then we can write the optimal feedback controls for the
linear-quadratic two-person differential game as

    u_1^*(t) = G_1^T(t)S(t)x(t)    (3.88)

    u_2^*(t) = G_2^T(t)S(t)x(t)    (3.89)

and the optimal cost to complete the game from the arbitrary time t is

    J(u_1^*,u_2^*) = \frac{1}{2}x^T(t)S(t)x(t)    (3.90)

while the necessary and sufficient condition for existence of the
solution is from the maximizing step of the minimax solution

    I - \int_{t}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0; \quad t \le \tau \le T    (3.91)

This necessary and sufficient condition is more stringent than
that of Ho, Bryson and Baron [14] and Rhodes and Luenberger [10] who
claim

    I + \int_{t}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
      - \int_{t}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0; \quad t \le \tau \le T    (3.92)

as the necessary and sufficient condition. The difference occurs
because their mathematics is conditioned by the a priori assumption of
a saddle point solution.
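
The gap between conditions (3.91) and (3.92) is easy to see in a scalar example. The sketch below assumes constant scalar coefficients, for which the integrals have a closed form; the function names and numerical values are hypothetical.

```python
import math

def gramian(f_coef: float, g: float, t: float, T: float) -> float:
    """Scalar integral of Phi(T,tau)^2 g^2 over [t, T], Phi(T,tau) = exp(f (T - tau))."""
    span = T - t
    if f_coef == 0.0:
        return g * g * span
    return g * g * (math.exp(2.0 * f_coef * span) - 1.0) / (2.0 * f_coef)

def minimax_condition(f_coef: float, g2: float, t: float, T: float) -> bool:
    """Scalar form of condition (3.91)."""
    return 1.0 - gramian(f_coef, g2, t, T) > 0.0

def saddle_condition(f_coef: float, g1: float, g2: float, t: float, T: float) -> bool:
    """Scalar form of condition (3.92)."""
    return 1.0 + gramian(f_coef, g1, t, T) - gramian(f_coef, g2, t, T) > 0.0

# Hypothetical data: a strong minimizer (g1 = 2.0) facing a maximizer with g2 = 1.5.
f_coef, g1, g2, T = 0.0, 2.0, 1.5, 1.0

long_horizon = (minimax_condition(f_coef, g2, 0.0, T),
                saddle_condition(f_coef, g1, g2, 0.0, T))
short_horizon = (minimax_condition(f_coef, g2, 0.8, T),
                 saddle_condition(f_coef, g1, g2, 0.8, T))
```

With these numbers (3.92) holds over the full horizon while (3.91) fails, and both hold once the remaining time is short enough. Since the first integral in (3.92) is nonnegative, (3.91) always implies (3.92), which is the sense in which (3.91) is the more stringent condition.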

With S(t) defined as in Equation (3.87) we can determine the
controls u_1^*(t) and u_2^*(t) from the matrix Riccati equation, developed
below. Taking the derivative of S(t) we obtain

    \dot{S}(t) = \dot{\Phi}^T(T,t)D(t)\Phi(T,t) + \Phi^T(T,t)\dot{D}(t)\Phi(T,t)
                 + \Phi^T(T,t)D(t)\dot{\Phi}(T,t)    (3.93)

where

    D(t) = \Big[ I + \int_{t}^{T} \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau
           - \int_{t}^{T} \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau \Big]^{-1}    (3.94)

But

    \frac{d}{dt}\Phi^T(T,t) = \frac{d}{dt}[ \Phi^{-1}(t,T) ]^T = -\Big[ \Phi^{-1}(t,T)\frac{d}{dt}\Phi(t,T)\,\Phi^{-1}(t,T) \Big]^T
                            = -F^T(t)\Phi^T(T,t);    (3.95)

    \frac{d}{dt}\Phi(T,t) = \frac{d}{dt}\Phi^{-1}(t,T) = -\Phi(T,t)F(t)    (3.96)

and

    \frac{d}{dt}D(t) = -D(t)\frac{d}{dt}D^{-1}(t)\,D(t) = D(t)[ \Phi(T,t)G_1(t)G_1^T(t)\Phi^T(T,t)
                       - \Phi(T,t)G_2(t)G_2^T(t)\Phi^T(T,t) ]D(t)    (3.97)

Substituting Equations (3.95) through (3.97) into Equation (3.93) we
obtain the matrix Riccati equation

    \dot{S}(t) = -S(t)F(t) - F^T(t)S(t) + S(t)[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) ]S(t)    (3.98)

with boundary condition

    S(T) = I    (3.99)

Note that the solution to the above equation can be obtained prior to
the actual game. A summary of the optimal strategies for the
linear-quadratic differential game with perfect information is presented in
Table 3.1.
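
As a check on the derivation, the scalar form of the Riccati equation (3.98) can be integrated backward from the boundary condition (3.99) and compared with the closed-form expression (3.87). The sketch below assumes constant scalar coefficients and a fixed-step fourth-order Runge-Kutta scheme; the names and values are hypothetical.

```python
import math

def riccati_rhs(S: float, f_coef: float, g1: float, g2: float) -> float:
    """Scalar form of (3.98): dS/dt = -2 f S + (g1^2 - g2^2) S^2."""
    return -2.0 * f_coef * S + (g1 * g1 - g2 * g2) * S * S

def integrate_backward(f_coef: float, g1: float, g2: float,
                       t: float, T: float, steps: int = 2000) -> float:
    """RK4 from the boundary condition S(T) = 1, Eq. (3.99), back to time t."""
    h = (t - T) / steps              # negative step: integrate backward in time
    S = 1.0
    for _ in range(steps):
        k1 = riccati_rhs(S, f_coef, g1, g2)
        k2 = riccati_rhs(S + 0.5 * h * k1, f_coef, g1, g2)
        k3 = riccati_rhs(S + 0.5 * h * k2, f_coef, g1, g2)
        k4 = riccati_rhs(S + h * k3, f_coef, g1, g2)
        S += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return S

def closed_form(f_coef: float, g1: float, g2: float, t: float, T: float) -> float:
    """Scalar form of (3.87) with Phi(T,tau) = exp(f (T - tau))."""
    E = math.exp(2.0 * f_coef * (T - t))
    integral = (E - 1.0) / (2.0 * f_coef) if f_coef != 0.0 else (T - t)
    return E / (1.0 + (g1 * g1 - g2 * g2) * integral)

# Hypothetical scalar data.
f_coef, g1, g2, t, T = 0.3, 1.0, 0.5, 0.0, 1.0
S_numeric = integrate_backward(f_coef, g1, g2, t, T)
S_exact = closed_form(f_coef, g1, g2, t, T)
gain_1 = g1 * S_numeric              # feedback gain of (3.88)
gain_2 = g2 * S_numeric              # feedback gain of (3.89)
```

Because S(t) depends only on the system data and not on the state, the gains can be tabulated before play begins, as the text notes.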




TABLE 3.1

SUMMARY OF OPTIMAL DETERMINISTIC STRATEGIES

$$\dot x = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_o) = x_o$$

Player 1: Perfect measurements

Player 2: Perfect measurements

$$J(u_1,u_2) = \tfrac12\left[x^T(T)x(T) + \int_{t_o}^T \left[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\right]dt\right]$$

$$u_1^*(t) = G_1^T(t)S(t)x(t)$$

$$u_2^*(t) = G_2^T(t)S(t)x(t)$$

$$\dot S = -SF(t) - F^T(t)S + S\left[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\right]S; \quad S(T) = I$$

$$J(u_1^*,u_2^*) = \tfrac12 x^T(t)S(t)x(t)$$

Necessary and sufficient conditions

$$I - \int_t^T \Phi(T,\tau)G_2(\tau)G_2^T(\tau)\Phi^T(T,\tau)\,d\tau > 0$$




CHAPTER 4


INTRODUCTION TO STOCHASTIC DIFFERENTIAL
GAMES AND DELAYED COMMITMENT STRATEGIES

For the differential game considered so far, we have assumed that we could make noiseless measurements of the system state vector and use those measurements in the system mechanization, i.e., we assumed a differential game with perfect information.

In many practical situations, however, the players have access only to noisy measurements, resulting in a game with imperfect information. Willman [8] has given a formal solution to this class of games, but, as an apparent consequence of this imperfect information, attempts to express these strategies in terms of finite-dimensional estimate vectors have been unsuccessful. A version of this game in which constraints are placed on the players' state estimators has been solved by Rhodes and Luenberger [17].

A subclass of games with imperfect information, in which one player's measurements are corrupted by white noise and the other player has perfect measurements, was solved in 1968 by Behn and Ho [9] for a pursuit-evasion game and in 1969 by Rhodes and Luenberger [10] for a more general game.

Harsanyi [18], in 1967/1968, used a chance move as a mathematical device in the analysis of static games with imperfect information to reformulate the game into a game with perfect information, called the "Bayes-equivalent" of the original game. The players enter the game, so to speak, after chance has made its choice. In part II [19], Harsanyi recognizes that this time gap is crucial when cooperative games with imperfect information are being played and shows that the normal form of a Bayesian game is, in many cases, a highly unsatisfactory representation of the game situation. He argued that the Bayesian games must be interpreted as games with delayed commitment.

In 1972, Aumann and Maschler [7] pointed out that the difficulties due to the time gap exist even if the players are playing a two-person zero-sum game with imperfect information, and Ho [21] extended their results to stochastic two-person games.

In this chapter, we will define the differential game problem in which the two opposing players have access only to noise-corrupted output measurements and introduce the delayed commitment strategies via a simple example of a one-stage stochastic difference game.

4.1  GAMES  WITH  IMPERFECT  STATE  INFORMATION 

As pointed out in Chapter 2, if the output measurements are corrupted by a random process we are faced with a stochastic problem. In order for a stochastic game to be mathematically tractable, the measurement noise must be describable by a finite set of sufficient statistics. In practice this means a linear system with quadratic cost and Gaussian noises corrupting the output measurements. The sufficient statistics are then the mean and covariance of the process.

Consider the linear system described by the vector differential equation

$$\dot x(t) = \frac{dx}{dt} = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t) \tag{4.1}$$

for which player 1, controlling $u_1(t)$, has available measurements of the form

$$z_1(t) = H_1(t)x(t) + w_1(t), \tag{4.2}$$

while player 2, controlling $u_2(t)$, has available the measurements

$$z_2(t) = H_2(t)x(t) + w_2(t) \tag{4.3}$$

The vector $x(t) \in E^n$ is the system state, $u_1(t) \in E^{p_1}$ and $u_2(t) \in E^{p_2}$ are the control vectors, and $z_1(t) \in E^{m_1}$ and $z_2(t) \in E^{m_2}$ are the measurement vectors. The matrices $F(t)$, $G_1(t)$ and $G_2(t)$ have the appropriate dimensions, while the matrices $H_1(t)$ and $H_2(t)$ are, respectively, $m_1 \times n$ and $m_2 \times n$ with $m_1, m_2 \le n$. The noise processes $\{w_1(t)\}$ and $\{w_2(t)\}$ are white Gaussian, with properties

$$
\begin{aligned}
\operatorname{cov}\{w_1(t), w_1(\tau)\} &= W_1(t)\,\delta(t-\tau) \\
\operatorname{cov}\{w_2(t), w_2(\tau)\} &= W_2(t)\,\delta(t-\tau) \\
\operatorname{cov}\{w_1(t), w_2(\tau)\} &= 0
\end{aligned} \tag{4.5}
$$

The initial state $x(t_o)$ is a Gaussian random vector, uncorrelated for all $t$ with $w_1(t)$ and $w_2(t)$, and having a mean of $x_o$ and a covariance

$$\operatorname{cov}\{x(t_o), x(t_o)\} = P_o \tag{4.6}$$

The cost functional or payoff to the game is quadratic:

$$J(u_1,u_2) = \tfrac12 E\left\{x^T(T)x(T) + \int_{t_o}^T u_1^T(t)u_1(t)\,dt - \int_{t_o}^T u_2^T(t)u_2(t)\,dt\right\} \tag{4.7}$$

where the final time $T$ is fixed and the expectation is taken over all the underlying random quantities $\{x(t_o), w_1(t), w_2(t)\}$. The simplified form of the payoff functional has been assumed, which can be obtained from a more general formulation using the transformation equations (3.4) of Chapter 3.

Let us now turn to the admissible strategies. Let $Z_i(t)$, $i = 1, 2$, be the output function measured by player $i$ over the interval $[t_o,t)$, i.e.,

$$Z_i(t) = \{(z_i(s), s) : s \in [t_o,t)\} \tag{4.8}$$

The class of admissible strategies is then restricted to those $U_1$ and $U_2$ which give rise to the feedback control laws

$$U_1 : u_1 = u_1(Z_1), \qquad U_2 : u_2 = u_2(Z_2) \tag{4.9}$$

Thus, the admissible strategies can only depend on the past cumulative observation data. Equation (4.9) can be expressed equivalently for $i = 1, 2$ as

$$u_i(t) = \psi_i(t, z_i^t); \quad z_i^t \triangleq \{z_i(s),\ s \in [t_o,t]\} \tag{4.10}$$

where $\psi_i(\cdot,\cdot)$ is viewed as a mapping from $[t_o,T] \times C^m[t_o,T] \to R^p$ and $C^m[t_o,T]$ is the class of continuous functions defined on $[t_o,T]$ with values in $R^m$.

The mapping $\psi_i(t,\cdot)$ for $i = 1, 2$ satisfies a Lipschitz condition:

$$\|\psi_i(t,\xi) - \psi_i(t,\eta)\| \le a\,\|\xi - \eta\|; \quad \xi, \eta \in C^m[t_o,T] \tag{4.11}$$

for all $t \in [t_o,T]$, where $a$ is some constant. The Lipschitz condition is imposed for technical reasons; it gives a sufficient condition for the existence of $(x(t), z_1(t))$, $(x(t), z_2(t))$ in (4.1) through (4.3).

When each controller is allowed either perfect measurements or noise-corrupted measurements, a total of four problems may be formulated, of which, due to symmetry, three are basically different.

Figure 4.1 shows the problem classification and indicates those discussed in this paper, together with some references to previous papers which examined solutions to those problems.


                          Player 2:                  Player 2:
                          Perfect Measurements       Noisy Measurements

Player 1:                 Closed-loop Game           Chapters 5, 6
Perfect Measurements      Chapter 3                  [9, 10]
                          [14, 15]

Player 1:                 --                         Chapter 7
Noisy Measurements                                   [8, 17, 22]

Figure 4.1  Problem Classification




which player 1 attempts to minimize and player 2 to maximize. Player 1 receives no measurements, while player 2 is given the measurements


where $x$ and $w_2$ are independent. The class of admissible strategies for player 1 is

$$U_1 : u_1 = k_1 = \text{constant} \tag{4.15}$$

and for player 2 is

$$U_2 : u_2 = k_2 z_2 \tag{4.16}$$

We can obtain the prior commitment strategy by substituting (4.12) into (4.13) which gives

$$J(u_1,u_2) = \tfrac12 E\{2u_1^2 - u_2^2 + 2xu_1 - 2xu_2 - 2u_1u_2\} \tag{4.17}$$




(4.20) 


(4.21) 


Thus, $u_1^o$, $u_2^o$ form a saddle point pair and

$$J(u_1^o,u_2^o) = \frac{\sigma^2}{2(\sigma+1)}$$

and it has been assumed that $u_1^o$, $u_2^o$ form the solution to the differential game.
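The prior commitment value can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not stated in the surviving text: the measurement is taken to be $z_2 = x + w_2$ with zero-mean $x$ of variance $\sigma$ and zero-mean $w_2$ of unit variance, which is the model consistent with the payoff $\sigma^2 / 2(\sigma+1)$ quoted above. The function names are hypothetical.

```python
def expected_payoff(k1: float, k2: float, sigma: float) -> float:
    """Expected payoff (4.17) for u1 = k1 and u2 = k2 * z2.

    Assumed moments (not stated in the text): E[x] = 0, E[x z2] = sigma,
    E[z2^2] = sigma + 1, i.e. z2 = x + w2 with var(x) = sigma, var(w2) = 1.
    """
    return 0.5 * (2.0 * k1 ** 2 - k2 ** 2 * (sigma + 1.0) - 2.0 * k2 * sigma)


def prior_commitment_gains(sigma: float) -> tuple:
    """Stationary point: dJ/dk1 = 0 gives k1 = 0; dJ/dk2 = 0 gives k2 = -sigma/(sigma+1)."""
    return 0.0, -sigma / (sigma + 1.0)


def is_saddle_point(sigma: float, deltas=(-0.5, -0.1, 0.1, 0.5)) -> bool:
    """Check J(k1 + d, k2) >= J(k1, k2) >= J(k1, k2 + d) for a few perturbations d."""
    k1, k2 = prior_commitment_gains(sigma)
    value = expected_payoff(k1, k2, sigma)
    return all(
        expected_payoff(k1 + d, k2, sigma) >= value >= expected_payoff(k1, k2 + d, sigma)
        for d in deltas
    )


sigma = 2.0
k1, k2 = prior_commitment_gains(sigma)
print(k1, k2, expected_payoff(k1, k2, sigma), is_saddle_point(sigma))
```

Under these assumptions the payoff is convex in $k_1$ and concave in $k_2$, so the stationary point is a saddle point of the expected payoff.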

However, consider the situation facing player 2 during the actual play of the game, after he has received the information and before anyone has acted. Player 2 now faces the payoff

$$J_2(u_1,u_2) = \tfrac12 E\{2u_1^2 - u_2^2 + 2xu_1 - 2xu_2 - 2u_1u_2 \mid z_2\} \tag{4.22}$$

and the secure strategy of this maximizing player is obtained by finding the maximin solution of Equation (4.22) subject to Equations (4.12), (4.15) and (4.16).

For arbitrary $u_2$ the minimizing strategy $u_1$ obtained from the partial derivative of $J_2$ with respect to $u_1$ is

$$u_1 = \tfrac12 (u_2 - \hat x_2) \tag{4.23}$$

where

$$\hat x_2 = E\{x \mid z_2\} \tag{4.24}$$

Substituting this result into Equation (4.22) gives

$$J_2(u_2) = \tfrac12 E\left\{-\tfrac32 u_2^2 - \hat x_2 u_2 - \tfrac12 \hat x_2^2 \,\middle|\, z_2\right\} \tag{4.25}$$




and the maximizing $u_2$ is found to be

$$u_2^* = -\tfrac13 \hat x_2 = -\tfrac13\,\frac{\sigma}{\sigma+1}\,z_2 \tag{4.26}$$

and thus

$$u_1^* = -\tfrac23 \hat x_2 = -\tfrac23\,\frac{\sigma}{\sigma+1}\,z_2 \tag{4.27}$$


The resulting maximin solution is thus

$$u_1^* = -\tfrac23 \hat x_2 \tag{4.28}$$

$$u_2^* = -\tfrac13 \hat x_2 \tag{4.29}$$


Since $\hat x_2$ can be regarded as part of the prior information and thus is a known number, $u_1^*$ and $u_2^*$ satisfy the restriction on the class of admissible strategies.

An analogous argument shows that the minimax strategies are the same as the maximin strategies. Hence $u_1^*$ and $u_2^*$ are not just the maximin solution, but they are a saddle point pair for $J_2$, i.e.,

$$J_2(u_1^*,u_2) \le J_2(u_1^*,u_2^*) \le J_2(u_1,u_2^*) \tag{4.30}$$


and the resulting payoff is

$$J_2(u_1^*,u_2^*) = -\frac16\,\frac{\sigma^2}{(\sigma+1)^2}\,z_2^2 \tag{4.31}$$
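The maximin computation can be reproduced with a few lines of Python. This is a sketch rather than the report's own procedure: it evaluates the conditional payoff (4.22) with $x$ replaced by its conditional mean $\hat x_2$, which is legitimate because both controls are known numbers once $z_2$ is given. The function names are hypothetical.

```python
def conditional_payoff(u1: float, u2: float, x_hat: float) -> float:
    """Payoff (4.22) after taking the conditional expectation, so E[x | z2] = x_hat."""
    return 0.5 * (2.0 * u1 ** 2 - u2 ** 2 + 2.0 * x_hat * u1
                  - 2.0 * x_hat * u2 - 2.0 * u1 * u2)


def best_response_u1(u2: float, x_hat: float) -> float:
    """Minimizing u1 for a given u2, Equation (4.23)."""
    return 0.5 * (u2 - x_hat)


def maximin_controls(x_hat: float) -> tuple:
    """Maximin pair: u2 maximizes the payoff after u1 has responded optimally."""
    u2 = -x_hat / 3.0
    return best_response_u1(u2, x_hat), u2


x_hat = 1.5  # an arbitrary value of the conditional mean, for illustration
u1_star, u2_star = maximin_controls(x_hat)
value = conditional_payoff(u1_star, u2_star, x_hat)
print(u1_star, u2_star, value)
```

For any $\hat x_2$ the value equals $-\hat x_2^2/6$, and perturbing either control away from the pair moves the payoff in the direction the saddle point inequality (4.30) requires.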




On the other hand, if player 1 uses strategy $u_1^*$ and player 2 uses $u_2^o$, then

$$J_2(u_1^*,u_2^o) = -\frac{13}{72}\,\frac{\sigma^2}{(\sigma+1)^2}\,z_2^2 \tag{4.32}$$

Obviously, $J_2(u_1^*,u_2^o) < J_2(u_1^*,u_2^*)$ and we conclude that for all possible values of the observation $z_2$, the strategy $u_2^*$ is actually a safer strategy for player 2 than $u_2^o$.

The reason for this phenomenon, as first pointed out by Harsanyi [19] and then by Aumann and Maschler [7], is inherent in the Normalization Principle of game theory. In the extensive form of the game a player makes his decision as to what control to use after receiving his measurements, while in the normal form of the game, this decision is effectively moved to before receiving those measurements.

In many games, the passage from the extensive to the normal form does not affect the course of action of the players and the two situations are formally equivalent. But, in our game with imperfect information, this passage changes the outlook of player 2. Indeed, if player 2 decides on a strategy before receiving the measurement $z_2$, he is justified in using the expected value of $z_2$ in his payoff function. However, when player 2 is informed, before making his decision, that a specific $z_2$ has been selected, there is no longer any justification for using the expected value of $z_2$. Thus, after the information is received, we really have a non zero-sum game facing the two players, with (4.17) the payoff for player 1 and (4.22) the payoff for player 2. It is this change in outlook that is ignored in the passage from the extensive to the normal form of the game.

In terms of Harsanyi's discussion the players "enter" the game after the "chance" (the measurement noise) has made its choice.

During the play of a stochastic differential (difference) game at time $t$ equal to or greater than $t_o$, the players effectively also "enter" the game having received the actual measurement (noise corrupted) up to that time.

Returning to our example, if player 2 has reason to believe that player 1 is committed to the strategy $u_1^o = 0$, then solving the resulting one-sided optimal control problem from player 2's point of view,

$$\max_{u_2} J_2(u_1^o,u_2) = J_2(0,u_2) = \tfrac12 E\{-u_2^2 - 2xu_2 \mid z_2\} \tag{4.33}$$

gives

$$u_2^o = -E\{x \mid z_2\} = -\frac{\sigma}{\sigma+1}\,z_2 \tag{4.34}$$

Similarly, if player 2 is committed to $u_2^o = -\frac{\sigma}{\sigma+1}\,z_2$, the solution to the resulting one-sided optimal control problem from player 1's point of view is

$$\min_{u_1} J_1(u_1,u_2^o) = \tfrac12 E\left\{2u_1^2 - \frac{\sigma^2}{(\sigma+1)^2}\,z_2^2 + 2xu_1 + 2\,\frac{\sigma}{\sigma+1}\,xz_2 + 2\,\frac{\sigma}{\sigma+1}\,u_1z_2\right\} \tag{4.35}$$

which gives

$$u_1^o = 0$$


Thus, the strategy pair $\{u_1^o, u_2^o\}$ is a Nash equilibrium solution to the non zero-sum game. Hence if player 2 knows a priori that player 1 will use $u_1^o$, or if he can convince player 1 that he is using $u_2^o$, then his optimal strategy will be $u_2^o$. However, it is well known that Nash equilibrium strategies do not possess any minimax or guaranteed value properties in non zero-sum games, and without this a priori knowledge there is no reason at all to play $u_2^o$ when $u_2^*$ is safer and available.




CHAPTER  5 


THE  PERFECT/NOISY  DIFFERENTIAL  GAME 

In this chapter we discuss the case where one of the players has perfect state information while the other player has only noisy measurements of the state. A physical example of such a problem would be the pursuit-evasion problem of a homing missile and an evading aircraft where the missile has considerable ground support via an up- and downlink to determine the state of the evader.

The problem is basically the same as that solved by Behn and Ho [9] as a pursuit-evasion differential game and extended by Rhodes and Luenberger [10] to more general differential games. Their solutions, however, are prior commitment solutions and assume that conditions are such that the player with perfect state information can deduce exactly, at each time $t$, the error in his opponent's state estimate.

In Section 1 of this chapter, we formulate the stochastic problem and discuss the prior commitment solution. The delayed commitment strategies for player 1 and player 2 are then obtained in Section 2 using function space techniques. It is then shown that the results can be interpreted in terms of matrix differential equations of the Riccati type.

The delayed commitment strategy optimality criteria are discussed in Section 3 and compared with those of the prior commitment strategy. The chapter is then concluded with a summary and discussion of the results obtained for the perfect/noise-corrupted two-person differential game.




5.1 PROBLEM FORMULATION AND PRIOR COMMITMENT SOLUTION

The problem formulation differs only slightly from that presented in Chapter 4, in that only one player has noise-corrupted measurements of the state vector, $x(t)$, during the game and an estimate of the initial condition, while the other player has perfect state information during the entire game. Thus, the linear continuous-time dynamic system is described by the vector differential equation

$$\dot x = \frac{dx}{dt} = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t) \tag{5.1}$$

and the quadratic cost functional is

$$J(u_1,u_2) = \tfrac12 E\left\{x^T(T)x(T) + \int_{t_o}^T \left[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\right]dt\right\} \tag{5.2}$$

where the dimensions for the vectors and matrices are as discussed in Section 4.1 and the final time $T$ is fixed.

Player 1 has perfect measurements of the state $x(t)$, while the measurements available to player 2 are of the form

$$z_2(t) = H_2(t)x(t) + w_2(t) \tag{5.3}$$

where the matrix $H_2(t)$ is $m_2 \times n$ with $m_2 \le n$. The noise $w_2(t)$ is assumed white, zero-mean and Gaussian with covariance

$$\operatorname{cov}\{w_2(t), w_2(\tau)\} = W_2(t)\,\delta(t-\tau) \tag{5.4}$$




The initial state $x(t_o)$ for player 2 is assumed to be a Gaussian random vector uncorrelated with $w_2(t)$ for all time $t \in [t_o,T]$ and having a mean $x_o$ and covariance

$$\operatorname{cov}\{x(t_o), x(t_o)\} = P_o \tag{5.5}$$

The initial state for player 1 is $x(t_o) = x_o$.

o  o 


Let $Z_2(t)$ be the output function measured by player 2 over the interval $[t_o,t)$, i.e.,

$$Z_2(t) = \{(z_2(s), s) : s \in [t_o,t)\} \tag{5.6}$$

The class of admissible strategies is restricted to those $U_1$ and $U_2$ which give rise to feedback control laws, i.e.,

$$
\begin{aligned}
U_1 &: u_1 = u_1(x(t), t) \\
U_2 &: u_2 = u_2(Z_2(t), t)
\end{aligned} \tag{5.7}
$$


Let the best linear estimate of the system state $x(t)$ given the measured output function $Z_2(t)$ be denoted $\hat x_2(t)$, i.e.,

$$\hat x_2(t) = E\{x(t) \mid Z_2(t)\} \tag{5.8}$$

The corresponding estimation error $\tilde x_2(t)$ is then

$$\tilde x_2(t) \triangleq x(t) - \hat x_2(t) \tag{5.9}$$


Since the random variables are normally distributed, the best linear estimate will also be the overall optimal estimate.

Previous prior commitment solutions require that conditions are such that the player with perfect state information (player 1) can deduce exactly at each time $t \in [t_o,T]$ the error in his opponent's state estimate, $\tilde x_2(t)$, or that this information is provided by some "mystical third party."

In the more general case, where player 1 can neither calculate nor is provided with $\tilde x_2(t)$, or equivalently $\hat x_2(t)$ from which $\tilde x_2(t) = x(t) - \hat x_2(t)$, he will have to build a filter from which he generates an estimate of his opponent's estimate, denoted $\hat x_{21}(t)$. Obviously, $\hat x_{21}(t)$, based on noisy data, will deviate from $\hat x_2(t)$ and player 2 should be able to take advantage of this error in player 1's estimate of the estimate of player 2, leading effectively to an additional term in player 2's control. However, such an additional correction term is based on noisy data and the opponent, player 1, should be able to take advantage of this error. However, the correction of player 1, in turn, is based on noisy data and player 2 should be able ...

What we have just encountered, if the general problem is solved from the prior commitment point of view, is known as the closure problem in stochastic games. It expresses the fact that an infinite number of terms seem to be required in the optimal strategies of each of the two players.

For the differential game defined by Equations (5.1) through (5.7), and under the assumption that player 1 can determine exactly the error in player 2's estimate, $\tilde x_2(t)$, the prior commitment optimal strategies* obtained by Behn and Ho and Rhodes and Luenberger are

$$
\begin{aligned}
u_1^o(t) &= G_1^T(t)\left[S(t)x(t) + N(t)\tilde x(t)\right] \\
u_2^o(t) &= G_2^T(t)S(t)\hat x(t)
\end{aligned} \tag{5.10}
$$


where the symmetric gain matrix $S(t)$ satisfies the matrix Riccati equation

$$\dot S = -SF(t) - F^T(t)S + S\left[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\right]S \tag{5.11}$$

with boundary condition

$$S(T) = I \tag{5.12}$$

and the symmetric gain matrix $N(t)$ satisfies the differential equation

$$
\begin{aligned}
\dot N = {} & -NF(t) - F^T(t)N - S\left[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\right]S \\
 & + (S + N)G_1(t)G_1^T(t)(S + N) + NP(t)H_2^T(t)W_2^{-1}(t)H_2(t) \\
 & + H_2^T(t)W_2^{-1}(t)H_2(t)P(t)N
\end{aligned} \tag{5.13}
$$

with boundary condition

$$N(T) = 0 \tag{5.14}$$

The symmetric error covariance matrix $P(t)$ satisfies

$$\dot P = A(t)P + PA^T(t) - PH_2^T(t)W_2^{-1}(t)H_2(t)P \tag{5.15}$$

with boundary condition

$$P(t_o) = \operatorname{cov}\{x(t_o), x(t_o)\} = P_o \tag{5.16}$$

where the matrix $A(t)$ is defined by

$$A(t) = F(t) - G_1(t)G_1^T(t)\left[S(t) + N(t)\right] \tag{5.17}$$

* In order to avoid confusion with the state estimate in the delayed commitment solution discussed below, the subscripts have been omitted from the state estimates and their errors in the prior commitment game.

Note that Equations (5.15) and (5.13) are coupled, so that the solution of this problem involves a nonlinear two-point boundary value problem given by these equations with boundary conditions (5.14) and (5.16). The solutions of the matrix Riccati type equations, i.e., Equations (5.11) through (5.17), can be obtained off-line, prior to the actual game.

The corresponding optimal expected cost from time $t$ is

$$J(u_1^o,u_2^o) = \tfrac12 x^T(t)S(t)x(t) + \tfrac12 \tilde x^T(t)N(t)\tilde x(t) + \tfrac12 \operatorname{tr}\left\{\int_t^T N(s)P(s)H_2^T(s)W_2^{-1}(s)H_2(s)P(s)\,ds\right\} \tag{5.18}$$

where $\operatorname{tr}\{\cdot\}$ is the trace operator.

5.2 DELAYED COMMITMENT STRATEGIES

During the actual play of the game at time $t$, and from the point of view of player 1, the payoff functional becomes

$$J_1(u_1,u_2) = \tfrac12 E\left\{x^T(T)x(T) + \int_t^T \left[u_1^T(\tau)u_1(\tau) - u_2^T(\tau)u_2(\tau)\right]d\tau \,\middle|\, X(t)\right\} \tag{5.19}$$

where

$$X(t) = \{(x(s), s) : s \in [t_o,t)\} \tag{5.20}$$

While, as pointed out in Chapter 4, the strategy pair $(u_1^o,u_2^o)$ presented in Section 5.1 still retains its equilibrium property, these strategies are no longer secure. In order for player 1 to determine his secure strategy, he has to find the saddle-point solution to Equation (5.19) subject to

$$\dot x = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_o) = x_o \tag{5.21}$$


Player 2 is faced with the problem of extracting useful information from his past measurements on which to base his control. However, player 2's perfect estimate is $\hat x_2(t) = x(t)$ and for the purpose of calculating player 1's secure strategy we assume that the allowable strategy for player 2, in addition to being $Z_2(t)$ measurable, is also $X(t)$ measurable. In other words, we want to determine that $u_1^* \in U_1$ and $u_2^* \in U_1 \times U_2$ which are optimal in the sense that for all $t \in [t_o,T]$

$$J_1(u_1^*,u_2) \le J_1(u_1^*,u_2^*) \le J_1(u_1,u_2^*) \tag{5.23}$$

where

$$
\begin{aligned}
U_1 &: u_1 = u_1(X(t), t) \\
U_2 &: u_2 = u_2(Z_2(t), t)
\end{aligned} \tag{5.24}
$$


The delayed commitment game from player 1's point of view is then the same as that solved in Section 3.2, for which we obtained the saddle-point solution

$$u_1^*(t) = G_1^T(t)S(t)x(t) \tag{5.25}$$

with the corresponding optimal response for player 2

$$u_2^*(t) = G_2^T(t)S(t)x(t) \tag{5.26}$$

where $S(t)$ is the solution to Equation (5.11), i.e.,

$$\dot S = -SF(t) - F^T(t)S + S\left[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\right]S; \quad S(T) = I \tag{5.27}$$

The resulting security payoff for player 1, i.e., his loss ceiling at arbitrary time $t$, is, from Equation (3.90),

$$J_1(u_1^*,u_2^*) = \tfrac12 x^T(t)S(t)x(t) \tag{5.28}$$

Note that in real life, when player 2 does not have a perfect estimate of the state $x(t)$, the payoff to player 1 can only be better, i.e., smaller than his loss ceiling.

If we now consider the game from the point of view of player 2, his payoff during the actual play of the game at time $t$ is

$$J_2(u_1,u_2) = \tfrac12 E\left\{x^T(T)x(T) + \int_t^T \left[u_1^T(\tau)u_1(\tau) - u_2^T(\tau)u_2(\tau)\right]d\tau \,\middle|\, Z_2(t)\right\} \tag{5.29}$$

and his secure strategy is obtained by finding the saddle-point solution to Equation (5.29) subject to

$$\dot x = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_o) = x_o \tag{5.30}$$

For the purpose of determining the secure strategy of player 2, we assume that the allowable strategy for player 1, in addition to being $X(t)$ measurable, is also $Z_2(t)$ measurable. Thus, player 2 wants to determine that $u_1^* \in U_1 \times U_2$ and $u_2^* \in U_2$ which are optimal in the sense that for all $t \in [t_o,T]$,

$$J_2(u_1^*,u_2) \le J_2(u_1^*,u_2^*) \le J_2(u_1,u_2^*) \tag{5.31}$$

where $U_1$ and $U_2$ are defined by Equation (5.24).

In terms of the Hilbert space notation developed in Chapter 3 the payoff functional becomes

$$J_2(u_1,u_2) = \tfrac12 E\left\{\langle \Phi x_o - T_1u_1 + T_2u_2,\ \Phi x_o - T_1u_1 + T_2u_2\rangle + \langle u_1,u_1\rangle - \langle u_2,u_2\rangle \,\middle|\, Z_2(t)\right\} \tag{5.32}$$

which includes the dynamic Equation (5.30), since it was used to develop the above payoff functional.

Thus, from player 2's point of view of a secure strategy, player 1 minimizes at arbitrary time $t = t_o$

$$\min_{u_1 \in U_1 \times U_2} \tfrac12\left\{\langle \Phi x_o - T_1u_1 + T_2u_2,\ \Phi x_o - T_1u_1 + T_2u_2\rangle + \langle u_1,u_1\rangle - \langle u_2,u_2\rangle\right\} \tag{5.33}$$

From Section 3.3 we know that the globally minimizing control of player 1 is

$$u_1 = T_1^*D_1(\Phi x_o + T_2u_2) \tag{5.34}$$

where

$$D_1 = (I + T_1T_1^*)^{-1} \tag{5.35}$$


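Substituting Equation (5.34) into (5.32) gives

$$
\begin{aligned}
J_2(u_2) = \tfrac12 E\big\{ & \langle \Phi x_o - T_1T_1^*D_1(\Phi x_o + T_2u_2) + T_2u_2,\ \Phi x_o - T_1T_1^*D_1(\Phi x_o + T_2u_2) + T_2u_2\rangle \\
 & + \langle T_1^*D_1(\Phi x_o + T_2u_2),\ T_1^*D_1(\Phi x_o + T_2u_2)\rangle - \langle u_2,u_2\rangle \ \big|\ Z_2(t)\big\}
\end{aligned} \tag{5.36}
$$

which simplifies after some algebra to

$$J_2(u_2) = \tfrac12 E\left\{\langle \Phi x_o + T_2u_2,\ D_1(\Phi x_o + T_2u_2)\rangle - \langle u_2,u_2\rangle \,\middle|\, Z_2(t)\right\} \tag{5.37}$$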
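The algebra behind this simplification can be sketched as follows. The shorthand $y = \Phi x_o + T_2u_2$ is introduced here only for illustration; the key identity, $I - T_1T_1^*D_1 = D_1$, follows directly from the definition (5.35), and the last step uses the fact that $D_1$ is self-adjoint.

```latex
\begin{align*}
D_1^{-1} = I + T_1T_1^*
 \;&\Longrightarrow\; I - T_1T_1^*D_1 = \left(D_1^{-1} - T_1T_1^*\right)D_1 = D_1, \\[4pt]
\langle y - T_1T_1^*D_1y,\ y - T_1T_1^*D_1y\rangle + \langle T_1^*D_1y,\ T_1^*D_1y\rangle
 &= \langle D_1y,\ D_1y\rangle + \langle D_1y,\ T_1T_1^*D_1y\rangle \\
 &= \langle D_1y,\ (I + T_1T_1^*)D_1y\rangle \\
 &= \langle D_1y,\ y\rangle = \langle y,\ D_1y\rangle .
\end{align*}
```

Substituting $y = \Phi x_o + T_2u_2$ back into the last line reproduces the first inner product of (5.37).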


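Let us define

$$P_2(t) \triangleq E\left\{[x(t) - \hat x_2(t)][x(t) - \hat x_2(t)]^T \,\middle|\, Z_2(t)\right\} \tag{5.38}$$

and consider the term

$$
\begin{aligned}
E\{\langle \Phi x, D_1\Phi x\rangle \mid Z_2(t)\}
 &= E\{\langle \Phi(x - \hat x_2 + \hat x_2),\ D_1\Phi(x - \hat x_2 + \hat x_2)\rangle \mid Z_2(t)\} \\
 &= E\{\langle \Phi(x - \hat x_2),\ D_1\Phi(x - \hat x_2)\rangle \mid Z_2(t)\} + 2E\{\langle \Phi(x - \hat x_2),\ D_1\Phi\hat x_2\rangle \mid Z_2(t)\} \\
 &\quad + E\{\langle \Phi\hat x_2,\ D_1\Phi\hat x_2\rangle \mid Z_2(t)\}
\end{aligned} \tag{5.39}
$$

But

$$E\{\langle \Phi(x - \hat x_2),\ D_1\Phi\hat x_2\rangle \mid Z_2(t)\} = \langle \Phi\hat x_2,\ D_1\Phi\hat x_2\rangle - \langle \Phi\hat x_2,\ D_1\Phi\hat x_2\rangle = 0 \tag{5.40}$$

and

$$E\{\langle \Phi(x - \hat x_2),\ D_1\Phi(x - \hat x_2)\rangle \mid Z_2(t)\} = E\{\langle \Phi^TD_1\Phi(x - \hat x_2),\ (x - \hat x_2)\rangle \mid Z_2(t)\} = \operatorname{tr}\left[\Phi^TD_1\Phi P_2\right] \tag{5.41}$$

where $\operatorname{tr}[\cdot]$ is the trace operator.

Thus, in general, the payoff functional can be written as

$$J_2(u_2) = \tfrac12\left\{\langle \Phi\hat x_2 + T_2u_2,\ D_1(\Phi\hat x_2 + T_2u_2)\rangle - \langle u_2,u_2\rangle + \operatorname{tr}\left[\Phi^TD_1\Phi P_2\right]\right\} \tag{5.42}$$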


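and Equation (5.37) becomes

$$J_2(u_2) = \tfrac12\left\{\langle \Phi\hat x_{2_o} + T_2u_2,\ D_1(\Phi\hat x_{2_o} + T_2u_2)\rangle - \langle u_2,u_2\rangle + \operatorname{tr}\left[\Phi^TD_1\Phi P_{2_o}\right]\right\} \tag{5.43}$$

where

$$
\begin{aligned}
\hat x_{2_o} &= E\{x(t_o) \mid Z_2(t_o)\} \\
P_{2_o} &= \operatorname{cov}\{x(t_o), x(t_o)\} = P_o
\end{aligned} \tag{5.44}
$$

Since $\operatorname{tr}\left[\Phi^TD_1\Phi P_{2_o}\right]$ is independent of the control $u_2(t)$, maximizing $J_2(u_2)$ is equivalent to maximizing

$$J_2'(u_2) = \tfrac12\left\{\langle \Phi\hat x_{2_o} + T_2u_2,\ D_1(\Phi\hat x_{2_o} + T_2u_2)\rangle - \langle u_2,u_2\rangle\right\} \tag{5.45}$$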


From the results of Sections 3.3 and 3.4, we know that the resulting maximin control for player 2 is

$$u_2^* = T_2^*D\Phi\hat x_{2_o} = T_2^*\left[I + T_1T_1^* - T_2T_2^*\right]^{-1}\Phi\hat x_{2_o} \tag{5.46}$$

or

$$u_2^*(t) = G_2^T(t)S(t)\hat x_{2_o}(t) \tag{5.47}$$

where $S(t)$ is again the solution to Equation (5.11), i.e.,

$$\dot S = -SF(t) - F^T(t)S + S\left[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\right]S; \quad S(T) = I \tag{5.48}$$

The corresponding optimal response for player 1 is, from Equations (5.34) and (5.47),

$$u_1^* = T_1^*D_1\left(\Phi x_o + T_2T_2^*D\Phi\hat x_{2_o}\right) \tag{5.49}$$

However, the initial time $t_o$ is completely arbitrary; thus, if $\hat x_2(t)$ can be made available for any $t$, the open-loop controls (Equations (5.46) and (5.49)) can be applied continuously and immediately to yield optimal feedback control laws by replacing $t_o$ by $t$.

Substituting Equations (5.46) and (5.49) into (5.30), the dynamic system for arbitrary $t$ is

$$\dot x(t) = \left[F(t) - G_1(t)(T_1^*D_1\Phi)(t)\right]x(t) + \left[G_2(t) - G_1(t)(T_1^*D_1T_2)(t)\right]\left[(T_2^*D\Phi)(t)\right]\hat x_2(t); \quad x(t_o) = x_o \tag{5.50}$$

and

$$z_2(t) = H_2(t)x(t) + w_2(t) \tag{5.51}$$

The linear-Gaussian assumptions imply that $\hat x_2(t)$ can be generated by a Kalman-Bucy filter based on a prior estimate of the initial state, a prior estimate of the variance of the error of this estimate, the measurements of the state up to time $t$, and the dynamic equation

$$
\begin{aligned}
\dot{\hat x}_2(t) = {} & \left[F(t) - G_1(t)(T_1^*D\Phi)(t) + G_2(t)(T_2^*D\Phi)(t)\right]\hat x_2(t) \\
 & + P_2(t)H_2^T(t)W_2^{-1}(t)\left[z_2(t) - H_2(t)\hat x_2(t)\right]; \quad \hat x_2(t_o) = x_o
\end{aligned} \tag{5.52}
$$

where $P_2(t)$ is the variance of the error of player 2's estimate and is obtained from

$$
\begin{aligned}
\dot P_2(t) = {} & \left[F(t) - G_1(t)(T_1^*D_1\Phi)(t)\right]P_2(t) + P_2(t)\left[F(t) - G_1(t)(T_1^*D_1\Phi)(t)\right]^T \\
 & - P_2(t)H_2^T(t)W_2^{-1}(t)H_2(t)P_2(t); \quad P_2(t_o) = P_o
\end{aligned} \tag{5.53}
$$




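Hence, the closed-loop optimal control for player 2 and the corresponding optimal closed-loop strategy for player 1 are:

$$
\begin{aligned}
u_2^* &= T_2^*D\Phi\hat x_2 \\
u_1^* &= T_1^*D_1\Phi x + T_1^*D_1T_2T_2^*D\Phi\hat x_2 = T_1^*D\Phi\hat x_2 + T_1^*D_1\Phi\tilde x_2
\end{aligned} \tag{5.54}
$$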
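The second form of $u_1^*$ rests on an operator identity that is worth writing out. The sketch below assumes $D = [I + T_1T_1^* - T_2T_2^*]^{-1}$, as in (5.46), $D_1 = [I + T_1T_1^*]^{-1}$, as in (5.35), and the decomposition $x = \hat x_2 + \tilde x_2$ from (5.9).

```latex
\begin{align*}
D^{-1} + T_2T_2^* = D_1^{-1}
 \;&\Longrightarrow\; D_1\left(I + T_2T_2^*D\right) = D_1\left(D^{-1} + T_2T_2^*\right)D = D, \\[4pt]
u_1^* &= T_1^*D_1\Phi\left(\hat x_2 + \tilde x_2\right) + T_1^*D_1T_2T_2^*D\Phi\hat x_2 \\
      &= T_1^*D_1\left(I + T_2T_2^*D\right)\Phi\hat x_2 + T_1^*D_1\Phi\tilde x_2 \\
      &= T_1^*D\Phi\hat x_2 + T_1^*D_1\Phi\tilde x_2 .
\end{align*}
```

Player 1's control thus splits into a term driven by player 2's estimate and a term driven by the error in that estimate.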


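If we define the symmetric matrix

$$M_1(t) \triangleq \Phi^T(T,t)D_1(t)\Phi(T,t) \tag{5.55}$$

where

$$D_1(t) = \left[I + \int_t^T \Phi(T,\tau)G_1(\tau)G_1^T(\tau)\Phi^T(T,\tau)\,d\tau\right]^{-1} \tag{5.56}$$

then taking the derivative of $M_1(t)$ with respect to $t$ we obtain

$$\dot M_1(t) = \dot\Phi^T(T,t)D_1(t)\Phi(T,t) + \Phi^T(T,t)\dot D_1(t)\Phi(T,t) + \Phi^T(T,t)D_1(t)\dot\Phi(T,t) \tag{5.57}$$

But

$$\frac{d}{dt}D_1(t) = -D_1(t)\,\frac{d}{dt}D_1^{-1}(t)\,D_1(t) = D_1(t)\Phi(T,t)G_1(t)G_1^T(t)\Phi^T(T,t)D_1(t) \tag{5.58}$$

thus

$$
\begin{aligned}
\dot M_1(t) = {} & -F^T(t)\Phi^T(T,t)D_1(t)\Phi(T,t) + \Phi^T(T,t)D_1(t)\Phi(T,t)G_1(t)G_1^T(t)\Phi^T(T,t)D_1(t)\Phi(T,t) \\
 & - \Phi^T(T,t)D_1(t)\Phi(T,t)F(t)
\end{aligned} \tag{5.59}
$$

or

$$\dot M_1(t) = -M_1(t)F(t) - F^T(t)M_1(t) + M_1(t)G_1(t)G_1^T(t)M_1(t) \tag{5.60}$$

with boundary condition

$$M_1(T) = I \tag{5.61}$$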

The optimal control for player 2 and corresponding optimal response for player 1 can then be written as

$$u_2^*(t) = G_2^T(t)S(t)\hat x_2(t) \tag{5.62}$$

$$u_1^*(t) = G_1^T(t)S(t)\hat x_2(t) + G_1^T(t)M_1(t)\tilde x_2(t) \tag{5.63}$$

where $S(t)$ and $M_1(t)$ are defined by Equations (5.11) and (5.60) respectively.

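Furthermore, using Equations (5.11) and (5.60), the Kalman-Bucy filter (Equation (5.52)) and corresponding covariance equation (5.53) can be written as

$$
\begin{aligned}
\dot{\hat x}_2(t) = {} & \left[F(t) - G_1(t)G_1^T(t)S(t) + G_2(t)G_2^T(t)S(t)\right]\hat x_2(t) \\
 & + P_2(t)H_2^T(t)W_2^{-1}(t)\left[z_2(t) - H_2(t)\hat x_2(t)\right]; \quad \hat x_2(t_o) = x_o
\end{aligned} \tag{5.64}
$$

and

$$
\begin{aligned}
\dot P_2(t) = {} & \left[F(t) - G_1(t)G_1^T(t)M_1(t)\right]P_2(t) + P_2(t)\left[F(t) - G_1(t)G_1^T(t)M_1(t)\right]^T \\
 & - P_2(t)H_2^T(t)W_2^{-1}(t)H_2(t)P_2(t); \quad P_2(t_o) = P_o
\end{aligned} \tag{5.65}
$$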
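Because all boundary conditions of the delayed commitment equations are given at a single time, the error covariance can be propagated by straightforward forward integration. The following Python sketch does this for a scalar system with $F = 0$ and $t_o = 0$; these simplifications, the constant coefficients, and the function names are assumptions made for illustration. With $F = 0$ the gain $M_1$ has the closed form $1/(1 + g_1^2(T - t))$, which satisfies (5.60) and (5.61).

```python
def m1_scalar(t: float, T: float, g1: float) -> float:
    """Solution of (5.60)-(5.61) for scalar F = 0: M1(t) = 1 / (1 + g1^2 (T - t))."""
    return 1.0 / (1.0 + g1 * g1 * (T - t))


def p2_rhs(t: float, p: float, T: float, g1: float, h: float, w: float) -> float:
    """Scalar form of (5.65) with F = 0: dp/dt = 2 a(t) p - (h^2 / w) p^2."""
    a = -g1 * g1 * m1_scalar(t, T, g1)  # a(t) = F - G1 G1^T M1(t) with F = 0
    return 2.0 * a * p - (h * h / w) * p * p


def integrate_p2(p0: float, t_end: float, T: float, g1: float,
                 h: float, w: float, steps: int = 4000) -> float:
    """Integrate P2 forward from P2(0) = p0 to t_end with RK4."""
    dt = t_end / steps
    t, p = 0.0, p0
    for _ in range(steps):
        k1 = p2_rhs(t, p, T, g1, h, w)
        k2 = p2_rhs(t + 0.5 * dt, p + 0.5 * dt * k1, T, g1, h, w)
        k3 = p2_rhs(t + 0.5 * dt, p + 0.5 * dt * k2, T, g1, h, w)
        k4 = p2_rhs(t + dt, p + dt * k3, T, g1, h, w)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return p


# No control by player 1 (g1 = 0): the covariance decays through measurements only.
p_measure_only = integrate_p2(p0=2.0, t_end=1.0, T=2.0, g1=0.0, h=1.0, w=0.5)
# No measurements (h = 0): the covariance shrinks through player 1's feedback only.
p_control_only = integrate_p2(p0=2.0, t_end=1.0, T=2.0, g1=1.0, h=0.0, w=1.0)
print(p_measure_only, p_control_only)
```

Both limiting cases have closed-form solutions, $p_0/(1 + p_0h^2t/w)$ and $p_0\left[(1 + g_1^2(T-t))/(1 + g_1^2T)\right]^2$ respectively, which makes them convenient checks on the integrator.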

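If we define

$$
\begin{aligned}
X_2(t) &\triangleq E\left\{x(t)x^T(t) \mid Z_2(t)\right\} \\
 &= E\left\{[\hat x_2(t) + \tilde x_2(t)][\hat x_2(t) + \tilde x_2(t)]^T \mid Z_2(t)\right\} \\
 &= \hat X_2(t) + P_2(t)
\end{aligned} \tag{5.66}
$$

with $\hat X_2(t) = \hat x_2(t)\hat x_2^T(t)$, then on substituting the optimal strategies (Equations (5.62) and (5.63)) into the system equation (Equation (5.30)) we can write, after post-multiplying by $x^T(t)$, adding the transpose of the resulting equation and then taking the conditional mean of the resulting expression,

$$
\begin{aligned}
\dot X_2 = {} & FX_2 + X_2F^T - G_1G_1^TS\hat X_2 - G_1G_1^TM_1P_2 + G_2G_2^TS\hat X_2 \\
 & - \hat X_2SG_1G_1^T - P_2M_1G_1G_1^T + \hat X_2SG_2G_2^T
\end{aligned} \tag{5.67}
$$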

Substitution of the optimal strategies into the payoff functional (Equation (5.29)) and using the trace operator allows us to write

$$J_2(u_1^*,u_2^*) = \tfrac12 \operatorname{tr}\left\{X_2(T) + \int_t^T \left[G_1G_1^TS\hat X_2S + G_1G_1^TM_1P_2M_1 - G_2G_2^TS\hat X_2S\right]d\tau\right\} \tag{5.68}$$

If we now add the perfect differentials $\frac{d}{d\tau}(SX_2)$, $\frac{d}{d\tau}(M_1P_2)$ and $\frac{d}{d\tau}(-SP_2)$ into the integrand of Equation (5.68) and compensate by adding $S(t)X_2(t) - S(T)X_2(T) + [M_1(t) - S(t)]P_2(t) - [M_1(T) - S(T)]P_2(T)$ outside the integral, most of the terms cancel and we obtain as the security payoff or gain floor for player 2

$$
\begin{aligned}
J_2(u_1^*,u_2^*) = {} & \tfrac12 x^T(t)S(t)x(t) + \tfrac12 \tilde x_2^T(t)\left[M_1(t) - S(t)\right]\tilde x_2(t) \\
 & + \tfrac12 \operatorname{tr}\int_t^T \left[M_1(s) - S(s)\right]P_2(s)H_2^T(s)W_2^{-1}(s)H_2(s)P_2(s)\,ds
\end{aligned} \tag{5.69}
$$

The entire game from player 2's point of view can be described by a 2n-dimensional system consisting of the vectors $x(t)$ and $\hat x_2(t)$ or, similarly, of the vectors $x(t)$ and $\tilde x_2(t)$. The $\{x(t), \hat x_2(t)\}$ system is obtained by substituting Equations (5.62) and (5.63) into the system Equation (5.30) to give

$$
\begin{aligned}
\dot x &= F(t)x(t) - G_1(t)G_1^T(t)S(t)\hat x_2(t) - G_1(t)G_1^T(t)M_1(t)\tilde x_2(t) + G_2(t)G_2^T(t)S(t)\hat x_2(t) \\
 &= \left[F(t) - G_1(t)G_1^T(t)M_1(t)\right]x(t) + \left[G_1(t)G_1^T(t)M_1(t) - G_1(t)G_1^T(t)S(t) + G_2(t)G_2^T(t)S(t)\right]\hat x_2(t)
\end{aligned} \tag{5.70}
$$

The input to this equation is obtained from Equation (5.64) or, on substituting $z_2(t) = H_2(t)x(t) + w_2(t)$, from

$$
\begin{aligned}
\dot{\hat x}_2 = {} & \left[F(t) - G_1(t)G_1^T(t)S(t) + G_2(t)G_2^T(t)S(t) - P_2(t)H_2^T(t)W_2^{-1}(t)H_2(t)\right]\hat x_2 \\
 & + P_2(t)H_2^T(t)W_2^{-1}(t)H_2(t)x(t) + P_2(t)H_2^T(t)W_2^{-1}(t)w_2(t)
\end{aligned} \tag{5.71}
$$

Than,  froa  player  2's  point  of  view  tha  entire  play  of  the  gaae  can  be 
described  by  the  2n-dlaenslonal  differential  equation 


In  the  above  systea  the  white  noise  «2(t)  which  Is  additive  aeasureaent 
noise  to  player  2,  appears  as  process  noise  to  the  2n-dlaenslonel 


systea. 
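The stacking of Equations (5.70) and (5.71) into one 2n-dimensional system can be sketched for the scalar case (n = 1). This is only an illustration: the function names and the numerical values are hypothetical, and the gains `S`, `N1` and the variance `P2` are treated as given numbers at one instant rather than solved from their differential equations. Subtracting the second row from the first shows that the estimation error obeys an equation of its own.

```python
def augmented_system(F, G1, G2, H2, W2, S, N1, P2):
    """Scalar (n = 1) sketch of d/dt [x, x_hat2] = A [x, x_hat2] + B w2."""
    K = P2 * H2 / W2                          # filter gain P2 H2^T W2^{-1}
    a11 = F - G1 * G1 * N1                    # coefficient of x in (5.70)
    a12 = G1 * G1 * (N1 - S) + G2 * G2 * S    # coefficient of x_hat2 in (5.70)
    a21 = K * H2                              # coefficient of x in (5.71)
    a22 = F - G1 * G1 * S + G2 * G2 * S - K * H2
    return [[a11, a12], [a21, a22]], [0.0, K]


def error_dynamics(A, B):
    """Row 1 minus row 2: coefficients of x, x_hat2 and w2 in d/dt (x - x_hat2)."""
    return A[0][0] - A[1][0], A[0][1] - A[1][1], B[0] - B[1]


# Hypothetical numbers, chosen only to exercise the formulas.
A, B = augmented_system(F=0.0, G1=0.8, G2=0.5, H2=1.0, W2=100.0,
                        S=0.9, N1=0.7, P2=50.0)
on_x, on_xhat, on_w = error_dynamics(A, B)
print(on_x, on_xhat, on_w)
```

The two coefficients returned for `x` and `x_hat2` are equal and opposite, so the error depends only on itself and on the measurement noise, which enters with the negative of the filter gain.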




5.3 DISCUSSION

The prior commitment and delayed commitment solutions to the stochastic differential game discussed in this chapter are summarized in Tables 5.1 and 5.2 respectively.

In the prior commitment formulation, the optimal control for player 1 consists of the sum of a term that is the same as that of the corresponding deterministic differential game and a term that is a linear function of the error in his opponent's state estimate. The optimal control for player 2 satisfies the Separation Theorem. Determination of the feedback gain for the first term of player 1 and for player 2 requires the solution of a simple matrix Riccati equation with terminal boundary conditions. To determine the feedback gain of the second term of player 1's strategy, however, we are faced with the often difficult task of finding the solution of a nonlinear two-point boundary value problem defined by the equations for $N$ and $P$ in Table 5.1.

In the case of the delayed commitment formulation, the secure strategy for player 1 is the same as for the deterministic game, while the secure strategy for player 2 satisfies the Separation Theorem. Determination of the feedback gains involves the simple solution of matrix Riccati equations with all the boundary conditions for each equation given at one point in time.

The secure delayed commitment payoff, $J_1$, for player 1 is identical to that obtained in Chapter 3 (Equation (3.90)) for the corresponding deterministic game. The difference between $J_1$ and the prior commitment payoff, $J$, can be written from Equations (5.28) and




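Table 5.1 Summary of the Prior Commitment Strategies

$\dot x = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t)$, $\quad x(t_0) \sim N(\hat x_0, P_0)$

Player 1: Perfect Measurements

Player 2: $z_2(t) = H_2(t)x(t) + w_2(t)$, $\quad w_2 \sim N(0, W_2)$

$J = \tfrac12 E\Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \Big\}$

$u_1^\circ(t) = G_1^T(t)S(t)x(t) + G_1^T(t)N(t)\tilde x(t)$

$u_2^\circ(t) = G_2^T(t)S(t)\hat x(t)$

$\dot S = -SF(t) - F^T(t)S + S\big[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\big]S; \quad S(T) = I$

$\dot N = -NF(t) - F^T(t)N - S\big[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\big]S + (S + N)G_1(t)G_1^T(t)(S + N) + NP(t)H_2^T(t)W_2^{-1}(t)H_2(t) + H_2^T(t)W_2^{-1}(t)H_2(t)P(t)N; \quad N(T) = 0$

$\dot{\hat x} = (F - G_1G_1^TS + G_2G_2^TS)\hat x + PH_2^TW_2^{-1}(z_2 - H_2\hat x); \quad \hat x(t_0) = \hat x_0$

$\dot P = AP + PA^T - PH_2^T(t)W_2^{-1}(t)H_2(t)P; \quad P(t_0) = P_0$

$A(t) = F(t) - G_1(t)G_1^T(t)\big[S(t) + N(t)\big]$

$J(u_1^\circ, u_2^\circ) = \tfrac12 x^T(t)S(t)x(t) + \tfrac12 \tilde x^T(t)N(t)\tilde x(t) + \tfrac12 \operatorname{tr} \int_t^T N(s)P(s)H_2^T(s)W_2^{-1}(s)H_2(s)P(s)\, ds$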


Table 5.2 Summary of the Delayed Commitment Strategies

$\dot x = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t)$, $\quad x(t_0) \sim N(\hat x_0, P_0)$

Player 1: Perfect measurements

Player 2: $z_2(t) = H_2(t)x(t) + w_2(t)$, $\quad w_2 \sim N(0, W_2)$

$J = \tfrac12 E\Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \Big\}$

Define: $Z_2(t) = \{(z_2(s), s);\ s \in [t_0, t)\}$

$U_1$: $u_1 = u_1(x(t), t)$

$U_2$: $u_2 = u_2(Z_2(t), t)$

$(Tu)(t) = \int_{t_0}^{t} \Phi(t,\tau)G(\tau)u(\tau)\, d\tau$

$(T^*\eta)(t) = G^T(t)\Phi^T(T,t)\,\eta$

-i-ii+viT-  °2  *  i 1  *  v**r‘ 

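Player 1

$J_1 = \tfrac12 E\Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \Big\}$

$u_1^* = T_1^* D \Phi x = G_1^T(t)S(t)x(t)$

$u_2^* = T_2^* D \Phi x = G_2^T(t)S(t)x(t)$

$\dot S = -SF(t) - F^T(t)S + S\big[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\big]S; \quad S(T) = I$

$J_1(u_1^*, u_2^*) = \tfrac12 x^T(t)S(t)x(t)$

Player 2

$J_2 = \tfrac12 E\Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \,\Big|\, Z_2(t) \Big\}$

$u_2^* = T_2^* D \Phi \hat x_2 = G_2^T(t)S(t)\hat x_2(t)$

$u_1^* = T_1^* D \Phi \hat x_2 + T_1^* D_1 \Phi \tilde x_2 = G_1^T(t)S(t)x(t) + G_1^T(t)\big[N_1(t) - S(t)\big]\tilde x_2(t)$

$\dot S = -SF(t) - F^T(t)S + S\big[G_1(t)G_1^T(t) - G_2(t)G_2^T(t)\big]S; \quad S(T) = I$

$\dot N_1 = -N_1F(t) - F^T(t)N_1 + N_1G_1(t)G_1^T(t)N_1; \quad N_1(T) = I$

$\dot{\hat x}_2 = \big[F - G_1G_1^TS + G_2G_2^TS\big]\hat x_2 + P_2H_2^TW_2^{-1}\big[z_2 - H_2\hat x_2\big]$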


(3.90) and using the trace operator as

$$J_1(u_1^*, u_2^*) - J(u_1^\circ, u_2^\circ) = -\tfrac12 \operatorname{tr}\Big\{ NP + \int_t^T N(s)P(s)H_2^T(s)W_2^{-1}(s)H_2(s)P(s)\, ds \Big\} \tag{5.73}$$


Taking the trace of Equation (5.13) and collecting terms, we can write

$$\operatorname{tr}\dot N = \operatorname{tr}\Big\{ -N\big[F(t) - P(t)H_2^T(t)W_2^{-1}(t)H_2(t) - G_1(t)G_1^T(t)S\big] - \big[F(t) - P(t)H_2^T(t)W_2^{-1}(t)H_2(t) - G_1(t)G_1^T(t)S\big]^T N + NG_1(t)G_1^T(t)N + SG_2(t)G_2^T(t)S \Big\}; \quad \operatorname{tr}\{N(T)\} = 0 \tag{5.74}$$


The above equation can be viewed as a linear differential equation driven by the term $\operatorname{tr}\{NG_1(t)G_1^T(t)N + SG_2(t)G_2^T(t)S\}$, which is greater than or equal to zero. Since the terminal value, $\operatorname{tr}\{N(T)\}$, equals zero, it follows that $\operatorname{tr}\{N(t)\}$ can only become smaller than zero as time progresses, and we conclude from Equation (5.73), since $\operatorname{tr}\{P\}$ and $\operatorname{tr}\{W_2\}$ are positive, that

$$J(u_1^\circ, u_2^\circ) \le J_1(u_1^*, u_2^*) \tag{5.75}$$


To study the relation between $J$ and the delayed commitment secure payoff, $J_2$, for player 2, assume that $P(t) = P_2(t)$; then subtracting Equation (5.69) from Equation (3.90) we obtain

$$J(u_1^\circ, u_2^\circ) - J_2(u_1^*, u_2^*) = \tfrac12 \operatorname{tr}\Big\{ \big[N - (N_1 - S)\big]P + \int_t^T \big[N(s) - (N_1(s) - S(s))\big] P(s)H_2^T(s)W_2^{-1}(s)H_2(s)P(s)\, ds \Big\} \tag{5.76}$$

But from Equations (5.13), (5.60) and (5.48), i.e., the $N$, $N_1$ and $S$ equations, and using the trace operator we can write

$$\operatorname{tr}\frac{d}{dt}\big[N - (N_1 - S)\big] = \operatorname{tr}\Big\{ -\big[N - (N_1 - S)\big]F - F^T\big[N - (N_1 - S)\big] - N_1G_1G_1^TN_1 - (S + N)G_1G_1^T(S + N) + NPH_2^TW_2^{-1}H_2 + H_2^TW_2^{-1}H_2PN \Big\}; \quad \operatorname{tr}\big\{N(T) - (N_1(T) - S(T))\big\} = 0 \tag{5.78}$$

From our earlier observation, $\operatorname{tr}\{N(t)\} < 0$ for all $t < T$, and we can again view the above equation as a linear differential equation of $\operatorname{tr}\{N - (N_1 - S)\}$ driven by the term $\operatorname{tr}\{-N_1G_1G_1^TN_1 - (S + N)G_1G_1^T(S + N) + NPH_2^TW_2^{-1}H_2 + H_2^TW_2^{-1}H_2PN\}$, which is smaller than or equal to zero. Since the terminal value, $\operatorname{tr}\{N(T) - (N_1(T) - S(T))\}$, is equal to zero, it follows that $\operatorname{tr}\{N(t) - (N_1(t) - S(t))\}$ can only be greater than or equal to zero. Thus, all the terms in Equation (5.76) are $\ge 0$ and hence

$$J_2(u_1^*, u_2^*) \le J(u_1^\circ, u_2^\circ) \tag{5.79}$$

and as a result of Equation (5.75)

$$J_2(u_1^*, u_2^*) \le J(u_1^\circ, u_2^\circ) \le J_1(u_1^*, u_2^*) \tag{5.80}$$




Note that if $W_2(t)$ is large, or if during the game

$$P_2(t) \to 0; \quad P(t) \to 0$$

then

$$N(t) \to \big(N_1(t) - S(t)\big)$$

and

$$J_2(u_1^*, u_2^*) \to J(u_1^\circ, u_2^\circ)$$


The relationship between the various payoffs discussed above is shown in Figure 5.1.

[Figure 5.1: payoff axis showing, in increasing order, $J_2(u_1^*, u_2^*)$, the gain floor (player 2); $J(u_1^\circ, u_2^\circ)$, the prior commitment payoff; and $J_1(u_1^*, u_2^*)$, the loss ceiling (player 1).]

Figure 5.1 Relationship Between Prior Commitment and Delayed Commitment Payoffs

It is immediately clear from Figure 5.1 that if player 1 knows that player 2 is committed to strategy $u_2^\circ(t)$, he should play $u_1^\circ(t)$, and similarly for player 2. Thus if the players had to determine at $t = t_0$ the strategies they would have to play for the rest of the game, $u_1^\circ(t)$ and $u_2^\circ(t)$ would be the proper choice. However, as we have seen in our tutorial example (Section 4.2), as soon as the game has advanced to a time $t > t_0$, $u_1^\circ(t)$ and $u_2^\circ(t)$ become unsafe strategies, as compared to $u_1^*(t)$ and $u_2^*(t)$ respectively.

On the other hand, if either player 1 or player 2 commits himself to his secure strategy, he can only be assured of his secure payoff. Thus we find, as is usual with games with imperfect information, that the players should keep their strategies secret.

The actual payoff, $J_0$, can only be calculated at the conclusion of the game, i.e., when everything has become a fact, and it is calculated from

$$J_0 = \tfrac12 \Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \Big\} \tag{5.81}$$

which depends on the actual values of the control functions $u_1(t)$ and $u_2(t)$ employed during the game, which in turn depend on the strategies employed and the actual values of $w_2(t)$.
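As a rough illustration of how an actual payoff of the form (5.81) could be evaluated after the fact, the sketch below applies the trapezoidal rule to recorded control histories. It assumes a scalar state and controls, uniformly spaced samples, and a factor of one half on both the terminal and the integral term; the function name and the sample data are hypothetical.

```python
def actual_payoff(x_final, u1_samples, u2_samples, dt):
    """Trapezoidal estimate of 0.5 * [x(T)^2 + integral of (u1^2 - u2^2) dt]."""
    integrand = [a * a - b * b for a, b in zip(u1_samples, u2_samples)]
    integral = sum(0.5 * (integrand[k] + integrand[k + 1]) * dt
                   for k in range(len(integrand) - 1))
    return 0.5 * (x_final * x_final + integral)


# Hypothetical record: constant controls over one time unit, terminal miss of 3.
u1 = [2.0] * 11
u2 = [1.0] * 11
print(actual_payoff(3.0, u1, u2, dt=0.1))
```

With constant controls the integrand is constant, so the quadrature is exact up to rounding and the result is 0.5 * (9 + 3).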


CHAPTER  6 


A  PURSUIT-EVASION  EXAMPLE 

One of the differential game problems most easily visualized is the problem of pursuit-evasion. In order to illuminate the results of the previous chapters we will, in this chapter, analyze a pursuit-evasion problem in two-dimensional Euclidean space where the pursuer, player 1, has perfect measurements of the state of his own system as well as that of the evader, while the evader, player 2, has only noise corrupted measurements. The problem satisfies Behn's [9] requirements for player 1 to determine exactly the error in player 2's state estimate, and thus allows us to compare the prior and delayed commitment problem formulations.

As  mentioned  in  Chapter  5,  a  physical  example  of  this  problem 
is  a  homing  missile  and  an  evading  aircraft  where  the  missile  has  an 
inertial  reference  unit  which  allows  accurate  determination  of  its 
state  vector  and,  in  addition,  has  considerable  ground  support  via  an 
up-  and  downlink  to  determine  the  state  of  the  evader.  The  aircraft 
has  only  noise  corrupted  measurements  of  its  own  inertial  reference 
system  and  of  the  missile  from  noise  corrupted  radar  measurements. 

6.1  PROBLEM  FORMULATION 

The space diagram showing the geometric relationship between missile and airplane or target during the encounter is shown in Figure 6.1. The missile and target velocity, $V_M$ and $V_T$ respectively, are assumed to be constant. Gravity effects have been neglected and the encounter is assumed to be restricted to the x-y plane.




Figure 6.1 Geometry of the Pursuit-Evasion Problem


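The fundamental relations governing missile and target paths are the velocity equations [22].

$$\begin{aligned}
\dot x_M &= V_M \cos\gamma_M, & \dot y_M &= V_M \sin\gamma_M \\
\dot x_T &= V_T \cos\gamma_T, & \dot y_T &= V_T \sin\gamma_T
\end{aligned} \tag{6.1}$$

The angles are subject to change since both missile and target are, of course, free to maneuver in the x-y plane. At an arbitrarily selected time $t = t_0$, the angles $\gamma_M$ and $\gamma_T$ have some initial values $\gamma_{M0}$ and $\gamma_{T0}$ and at a later time $t$ are perturbed by small amounts $\gamma_m$ and $\gamma_t$ to $\gamma_M$ and $\gamma_T$ respectively, while the line of sight has changed from zero at $t = t_0$ to $\sigma$. Under these conditions, the instantaneous angles of the velocity vectors are

$$\gamma_M(t) = \gamma_{M0} + \gamma_m(t) \tag{6.2}$$

and

$$\gamma_T(t) = \gamma_{T0} + \gamma_t(t) \tag{6.3}$$

so that the linear velocity components are

$$\begin{aligned}
\dot x_M &= V_M \cos\gamma_{M0} - \gamma_m V_M \sin\gamma_{M0} \\
\dot y_M &= V_M \sin\gamma_{M0} + \gamma_m V_M \cos\gamma_{M0} \\
\dot x_T &= V_T \cos\gamma_{T0} - \gamma_t V_T \sin\gamma_{T0} \\
\dot y_T &= V_T \sin\gamma_{T0} + \gamma_t V_T \cos\gamma_{T0}
\end{aligned} \tag{6.4}$$

where the small angle approximations $\sin\gamma = \gamma$ and $\cos\gamma = 1$ have been used.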
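The size of the error introduced by the small angle approximations can be checked numerically. The sketch below compares exact velocity components with the linearized components for one vehicle; the function names and the sample speed and angles are hypothetical and serve only to illustrate the comparison.

```python
import math


def exact_velocity(speed, gamma0, dgamma):
    """Velocity components for a flight path angle gamma0 + dgamma."""
    g = gamma0 + dgamma
    return speed * math.cos(g), speed * math.sin(g)


def linearized_velocity(speed, gamma0, dgamma):
    """First-order components, using sin(dgamma) = dgamma and cos(dgamma) = 1."""
    x_dot = speed * math.cos(gamma0) - dgamma * speed * math.sin(gamma0)
    y_dot = speed * math.sin(gamma0) + dgamma * speed * math.cos(gamma0)
    return x_dot, y_dot


# Hypothetical case: 1000 ft/sec, 0.5 rad initial angle, 0.01 rad perturbation.
exact = exact_velocity(1000.0, 0.5, 0.01)
approx = linearized_velocity(1000.0, 0.5, 0.01)
print([e - a for e, a in zip(exact, approx)])
```

The discrepancy is second order in the perturbation angle, roughly the speed times half the squared perturbation, so it stays well below 0.1 ft/sec in this case.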

If we assume that the missile and target are initially on a collision course, i.e.

$$V_T \sin\gamma_{T0} = V_M \sin\gamma_{M0} \tag{6.5}$$

then using this equation and Equation (6.4)

$$\dot x_T - \dot x_M = V_T \cos\gamma_{T0} - V_M \cos\gamma_{M0} - (\gamma_t - \gamma_m)\, V_M \sin\gamma_{M0} \tag{6.6}$$

If we neglect the difference term involving $\gamma_t - \gamma_m$, the closing velocity $V_c$ is given approximately by

$$V_c = -(\dot x_T - \dot x_M) = -V_T \cos\gamma_{T0} + V_M \cos\gamma_{M0} \tag{6.7}$$

and in view of the assumed constant velocities

$$x_T(t) - x_M(t) = V_c (T - t) \tag{6.8}$$
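The closing velocity and the time remaining to intercept implied by the relations above can be sketched as follows. The sign convention (closing velocity taken as the negative of the relative x-velocity), the function names and the numbers are assumptions made for illustration.

```python
import math


def closing_velocity(v_m, gamma_m0, v_t, gamma_t0):
    """Approximate closing velocity, neglecting the perturbation angles."""
    return v_m * math.cos(gamma_m0) - v_t * math.cos(gamma_t0)


def time_to_go(x_relative, v_c):
    """Time remaining, from x_T(t) - x_M(t) = V_c * (T - t)."""
    return x_relative / v_c


# Hypothetical head-on geometry with both flight path angles zero.
v_c = closing_velocity(2.0, 0.0, 1.0, 0.0)
print(v_c, time_to_go(5.0, v_c))
```

Because the velocities are assumed constant, the time remaining is a known function of the current down-range separation, which is what allows the final time to be treated as fixed.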

Since only the relative positions of the missile and target need to be known, i.e., $x_r(t) = x_T(t) - x_M(t)$ and $y_r(t) = y_T(t) - y_M(t)$, the relative missile-target position is uniquely specified by giving the time $t$ and the projection of M and T on a line, L, perpendicular to the initial line of sight. Thus, the original problem has been changed from a two-dimensional intercept problem with unspecified final time $T$ to a one-dimensional intercept problem with a final time


(6.9)

$n_1$, $n_2$ are the lateral missile and target accelerations respectively in G's.

Now let $u_1'(t)$ and $u_2'(t)$ be the missile and target called-for accelerations respectively, and let us assume the following missile and target system transfer functions

$$\frac{n_1(s)}{u_1'(s)} = \frac{1}{\tau_1 s + 1}, \qquad \frac{n_2(s)}{u_2'(s)} = \frac{1}{\tau_2 s + 1} \tag{6.15}$$

where both $\tau_1$ and $\tau_2$ are positive real numbers. The resulting equations of motion under the above used assumptions of

1. Constant target and missile velocity

2. $\sigma$, $\gamma_m$ and $\gamma_t$ are small angles

are then


(6.16)

(6.17)

If we define the vectors


with  Initial  conditions 


'!<*«> 


rT0>) 

V_  sin  Y__ 
T  TO 


or 


(6.23)

$$\dot y_2 = F_2'\, y_2 + G_2'\, u_2'; \quad y_2(t_0) = y_{20} \tag{6.24}$$

and with a final time

$$T = \frac{x_r(0)}{V_c} \tag{6.25}$$


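Player 1 has perfect measurements of his own and his opponent's state vector, while the measurements of player 2 are of the form

$$\begin{aligned}
z_1'(t) &= H_1(t)y_1(t) + w_1'(t) \\
z_2'(t) &= H_2(t)y_2(t) + w_2'(t)
\end{aligned} \tag{6.26}$$

where $w_1'(t)$ and $w_2'(t)$ are Gaussian white noise vectors, with zero mean and with

$$\begin{aligned}
\operatorname{cov}\{w_1'(t), w_1'(\tau)\} &= W_1'(t)\,\delta(t - \tau) \\
\operatorname{cov}\{w_2'(t), w_2'(\tau)\} &= W_2'(t)\,\delta(t - \tau) \\
\operatorname{cov}\{w_1'(t), w_2'(\tau)\} &= C(t)\,\delta(t - \tau)
\end{aligned} \tag{6.27}$$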

In addition, let the payoff criterion be given by

$$J = \frac{a^2}{2}\big[y_1(T) - y_2(T)\big]^T\big[y_1(T) - y_2(T)\big] + \frac12 \int_{t_0}^{T} \big[u_1'^T(t)R_1(t)u_1'(t) - u_2'^T(t)R_2(t)u_2'(t)\big]\, dt \tag{6.28}$$

where both $R_1(t)$ and $R_2(t)$ are positive definite and $a^2$ is introduced to allow for weighting of terminal miss against energy.

The above formulation will now be recognized as the classical interception problem in Euclidean space; i.e., player 1, the pursuer, attempts to intercept player 2, the evader, at some fixed time $T$ while the latter tries to do the opposite. Both players have limited energy sources and do not care about the difference in the velocities of the two players at the terminal time.

From the point of view of the criterion, the number of "interesting" variables is the same as the number of control variables. Hence, this formulation of the game basically satisfies Behn's criterion for the ability of player 1 to determine the error in the state estimate of player 2.

If we define

$$x'(t) = [\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)y_1(t) - \Phi_2(T,t)y_2(t)\big] \tag{6.29}$$

where $\Phi_1(t,\tau)$ and $\Phi_2(t,\tau)$ are the transition matrices for player 1 and player 2 respectively, then $x'(t)$ represents the terminal miss predicted at time $t$ on the basis that no control is applied during the interval $[t, T]$.

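The above transformation allows us to reduce the dimension of the problem since, on taking the derivative of Equation (6.29) and using (6.24) and (6.25), we obtain

$$\begin{aligned}
\dot x'(t) &= [\,1\ \ 0\ \ 0\,]\Big\{ \frac{\partial \Phi_1(T,t)}{\partial t}\, y_1(t) + \Phi_1(T,t)\big(F_1' y_1 + G_1' u_1'\big) - \frac{\partial \Phi_2(T,t)}{\partial t}\, y_2(t) - \Phi_2(T,t)\big(F_2' y_2 + G_2' u_2'\big) \Big\} \\
&= [\,1\ \ 0\ \ 0\,]\big\{ \Phi_1(T,t)G_1' u_1' - \Phi_2(T,t)G_2' u_2' \big\} \\
&= -G_1'(t,T)\,u_1'(t) + G_2'(t,T)\,u_2'(t)
\end{aligned}$$

where

$$\begin{aligned}
G_1'(t,T) &= -[\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)G_1'\big] \\
G_2'(t,T) &= -[\,1\ \ 0\ \ 0\,]\big[\Phi_2(T,t)G_2'\big]
\end{aligned} \tag{6.32}$$

while the performance criterion in terms of $x'(t)$ is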


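For our example, the transition matrix of the system of player 1 is

$$\Phi_1(T,t) = \begin{bmatrix} 1 & \bar T & -K_1\tau_1^2\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big) \\ 0 & 1 & K_1\tau_1\big(1 - e^{-\bar T/\tau_1}\big) \\ 0 & 0 & e^{-\bar T/\tau_1} \end{bmatrix}$$

where

$$\bar T = \text{time-to-go} = T - t \tag{6.34}$$

The transition matrix for player 2 is the same as that for player 1 with the subscripts 1 replaced by 2.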
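One way to gain confidence in a transition matrix of this form is to check the composition property: propagating over an interval a and then over an interval b must equal propagating over a + b. The sketch below does this in plain Python. The function names are hypothetical, and the state ordering (position, velocity, achieved acceleration) is an assumption inferred from the structure of the matrix.

```python
import math


def phi(t_go, gain, tau):
    """Transition matrix over an interval t_go for a first-order-lag vehicle."""
    e = math.exp(-t_go / tau)
    return [[1.0, t_go, -gain * tau * tau * (1.0 - e - t_go / tau)],
            [0.0, 1.0, gain * tau * (1.0 - e)],
            [0.0, 0.0, e]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical values: 32.2 ft/sec^2 per G and a one-second time constant.
whole = phi(1.5, 32.2, 1.0)
composed = matmul(phi(1.0, 32.2, 1.0), phi(0.5, 32.2, 1.0))
print(max(abs(whole[i][j] - composed[i][j]) for i in range(3) for j in range(3)))
```

The printed difference should be at the level of floating-point rounding, which is consistent with the matrix being the exponential of a constant system matrix.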

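From Equation (6.32), $G_1'(t,T)$ and $G_2'(t,T)$ are scalars and are given by

$$G_1'(t,T) = -[\,1\ \ 0\ \ 0\,]\,\Phi_1(T,t)\begin{bmatrix} 0 \\ 0 \\ 1/\tau_1 \end{bmatrix} = K_1\tau_1\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big) \tag{6.35}$$

and

$$G_2'(t,T) = -[\,1\ \ 0\ \ 0\,]\,\Phi_2(T,t)\begin{bmatrix} 0 \\ 0 \\ 1/\tau_2 \end{bmatrix} = K_2\tau_2\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big) \tag{6.36}$$

Thus,

$$\dot x'(t) = -K_1\tau_1\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)u_1' + K_2\tau_2\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big)u_2' \tag{6.37}$$

$$x'(t_0) = \big[y_M(t_0) - y_T(t_0)\big] + [T - t_0]\big[V_M \sin\gamma_{M0} - V_T \sin\gamma_{T0}\big] \tag{6.38}$$

$$T = \frac{x_T(0) - x_M(0)}{V_c} \tag{6.39}$$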


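With the dynamical system reduced to Equation (6.37), the measurements of player 2 must be reduced to measurements on $x'(t)$. If we define

$$z_2''(t) \triangleq [\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)H_1^{-1}(t)z_1'(t) - \Phi_2(T,t)H_2^{-1}(t)z_2'(t)\big] \tag{6.40}$$

then using Equations (6.26) and (6.29) we can write

$$\begin{aligned}
z_2''(t) &= [\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)\big(y_1 + H_1^{-1}w_1'\big) - \Phi_2(T,t)\big(y_2 + H_2^{-1}w_2'\big)\big] \\
&= x'(t) + [\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)H_1^{-1}w_1' - \Phi_2(T,t)H_2^{-1}w_2'\big]
\end{aligned} \tag{6.41}$$

or

$$z_2''(t) = x'(t) + w_2''(t) \tag{6.42}$$

where the zero-mean, white noise process $w_2''(t)$ is given by

$$w_2''(t) = [\,1\ \ 0\ \ 0\,]\big[\Phi_1(T,t)H_1^{-1}w_1' - \Phi_2(T,t)H_2^{-1}w_2'\big] \tag{6.43}$$

with

$$W_2'' = [\,1\ \ 0\ \ 0\,]\Big\{ \Phi_1(T,t)H_1^{-1}W_1'H_1^{-T}\Phi_1^T(T,t) - \Phi_1(T,t)H_1^{-1}C\,H_2^{-T}\Phi_2^T(T,t) - \Phi_2(T,t)H_2^{-1}C^TH_1^{-T}\Phi_1^T(T,t) + \Phi_2(T,t)H_2^{-1}W_2'H_2^{-T}\Phi_2^T(T,t) \Big\}[\,1\ \ 0\ \ 0\,]^T \tag{6.44}$$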

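If we define the energy weighting matrices $R_1(t)$ and $R_2(t)$, which in this case are scalars, by $r_{11}^2(t)$ and $r_{22}^2(t)$ respectively, then on using the transformations

$$x = a x', \qquad u_1 = r_{11} u_1', \qquad u_2 = r_{22} u_2' \tag{6.45}$$

we can write the system equations as

$$\dot x = -\frac{a\,G_1'(t,T)}{r_{11}(t)}\,u_1(t) + \frac{a\,G_2'(t,T)}{r_{22}(t)}\,u_2(t) \tag{6.46}$$

$$a z_2''(t) = x(t) + a w_2''(t) \tag{6.47}$$

Defining

$$\begin{aligned}
G_1(t) &= \frac{a\,G_1'(t,T)}{r_{11}} = \frac{aK_1\tau_1}{r_{11}}\big[1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big] \\
G_2(t) &= \frac{a\,G_2'(t,T)}{r_{22}} = \frac{aK_2\tau_2}{r_{22}}\big[1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big] \\
z_2(t) &= a z_2''(t) \\
w_2(t) &= a w_2''(t) \\
W_2\,\delta(t - \tau) &= a^2 W_2''\,\delta(t - \tau)
\end{aligned} \tag{6.48}$$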


We have the original problem reduced to the notation used in this paper, i.e., the dynamic system is

$$\dot x(t) = -\frac{aK_1\tau_1}{r_{11}}\big[1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big]u_1(t) + \frac{aK_2\tau_2}{r_{22}}\big[1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big]u_2(t) \tag{6.49}$$

$$z_2(t) = x(t) + w_2(t) \tag{6.50}$$

with initial condition

$$x(t_0) = x_0 = a\Big\{\big[y_M(t_0) - y_T(t_0)\big] + [T - t_0]\big[V_M \sin\gamma_{M0} - V_T \sin\gamma_{T0}\big]\Big\} \tag{6.51}$$

for player 1 and an a priori estimate of $x(t_0)$ for player 2, and with

$$T = \frac{x_T(0) - x_M(0)}{V_c} \tag{6.52}$$

Or

$$\begin{aligned}
\dot x(t) &= -G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_0) = x_0 \\
z_2(t) &= H_2 x(t) + w_2(t)
\end{aligned} \tag{6.53}$$

and the performance criterion is

$$J(u_1, u_2) = \tfrac12 E\Big\{ x^T(T)x(T) + \int_{t_0}^{T} \big[u_1^T(t)u_1(t) - u_2^T(t)u_2(t)\big]\, dt \Big\} \tag{6.54}$$

Note that in addition to our simplifying assumptions of small angles $\gamma$ and $\sigma$, and constant velocities, we have implicitly assumed that player 2's initial estimates are such that the final time $T$ is the same for both players.

6.2  DELAYED  COMMITMENT  SOLUTION 


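With reference to Table 5.2, the delayed commitment strategy for player 1 is given by

$$u_1^*(t) = G_1^T(t)S(t)x(t)$$

where

$$\dot S = \left[ \frac{a^2K_1^2\tau_1^2}{r_{11}^2}\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)^2 - \frac{a^2K_2^2\tau_2^2}{r_{22}^2}\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big)^2 \right] S^2; \quad S(T) = 1$$

The above equation is separable and has a closed form solution; i.e.,

$$S^{-1}(t) = 1 + \int_t^T \left[ \frac{a^2K_1^2\tau_1^2}{r_{11}^2}\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)^2 - \frac{a^2K_2^2\tau_2^2}{r_{22}^2}\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big)^2 \right] ds \tag{6.58}$$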


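and $S(t)$ is found to be

$$S(t) = \frac{6r_{11}^2r_{22}^2}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \tag{6.59}$$

Thus, the optimal delayed commitment control function for player 1 is given by

$$u_1^*(t) = G_1^T(t)S(t)x(t) = \frac{6aK_1\tau_1 r_{11}r_{22}^2\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)\,x(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \tag{6.60}$$

The corresponding optimal strategy for player 2 at time $t$ is then

$$u_2^*(t) = G_2^T(t)S(t)x(t) = \frac{6aK_2\tau_2 r_{11}^2r_{22}\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big)\,x(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \tag{6.61}$$

and the secure payoff for player 1 at time $t$ is

$$J_1(u_1^*, u_2^*) = \tfrac12 x^T(t)S(t)x(t) = \frac{3r_{11}^2r_{22}^2\,x^2(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \tag{6.62}$$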
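The closed form for S(t) can be cross-checked against a direct numerical integration of the scalar Riccati equation backward from S(T) = 1. The sketch below does so with a fourth-order Runge-Kutta step, using the parameter values of Table 6.1; the function names, the step size and the tolerance are choices made here for illustration, not taken from the report's program.

```python
import math

# Parameter values as listed in Table 6.1 (a^2, K1, K2, tau1, tau2, r11^2, r22^2, T).
A2, K1, K2 = 0.04, 32.2, 32.2
TAU1, TAU2 = 1.0, 2.0
R11SQ, R22SQ = 0.001, 0.004
T_FINAL = 10.0


def bracket(tau, t_go):
    """Six times the integral of tau^2 (1 - exp(-s/tau) - s/tau)^2 over [0, t_go]."""
    return (6.0 * tau ** 2 * t_go - 6.0 * tau * t_go ** 2 + 2.0 * t_go ** 3
            + 3.0 * tau ** 3 * (1.0 - math.exp(-2.0 * t_go / tau))
            - 12.0 * tau ** 2 * t_go * math.exp(-t_go / tau))


def s_closed(t):
    """Closed-form gain S(t)."""
    t_go = T_FINAL - t
    den = (6.0 * R11SQ * R22SQ + A2 * K1 ** 2 * R22SQ * bracket(TAU1, t_go)
           - A2 * K2 ** 2 * R11SQ * bracket(TAU2, t_go))
    return 6.0 * R11SQ * R22SQ / den


def rate(t):
    """Coefficient G1(t)^2 - G2(t)^2 multiplying S^2 in the Riccati equation."""
    t_go = T_FINAL - t
    g1 = A2 * K1 ** 2 * TAU1 ** 2 / R11SQ * (1.0 - math.exp(-t_go / TAU1) - t_go / TAU1) ** 2
    g2 = A2 * K2 ** 2 * TAU2 ** 2 / R22SQ * (1.0 - math.exp(-t_go / TAU2) - t_go / TAU2) ** 2
    return g1 - g2


def s_rk4(t_target, dt=0.001):
    """Integrate dS/dt = rate(t) * S^2 backward in time from S(T) = 1."""
    steps = int(round((T_FINAL - t_target) / dt))
    t, s, h = T_FINAL, 1.0, -dt
    for _ in range(steps):
        k1 = rate(t) * s * s
        k2 = rate(t + h / 2.0) * (s + h * k1 / 2.0) ** 2
        k3 = rate(t + h / 2.0) * (s + h * k2 / 2.0) ** 2
        k4 = rate(t + h) * (s + h * k3) ** 2
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return s


print(s_closed(0.0), s_rk4(0.0))
```

Both numbers should agree closely, which also illustrates the remark that all boundary conditions are given at one point in time: a single backward sweep suffices.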


The delayed commitment strategy for player 2 at time $t$ is given by

$$u_2^*(t) = G_2^T(t)S(t)\hat x_2(t) \tag{6.63}$$

which is simply Equation (6.61) with $x(t)$ replaced by $\hat x_2(t)$. The corresponding optimal strategy for player 1 at time $t$ is then

$$u_1^*(t) = G_1^T(t)S(t)\hat x_2(t) + G_1^T(t)N_1(t)\tilde x_2(t) \tag{6.64}$$

where $N_1(t)$ satisfies,

(6.68)


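If we assume that the noise variance for our example is given by

$$\operatorname{cov}\{w_2(t), w_2(\tau)\} = W_2\,\delta(t - \tau) \tag{6.69}$$

and that the measurement matrix of player 2 is

$$H_2 = h_2 \quad \text{(a scalar)} \tag{6.70}$$

then the expected secure payoff for player 2 at time $t$ is

$$\begin{aligned}
J_2(u_1^*, u_2^*) &= \tfrac12 \hat x_2^T(t)S(t)\hat x_2(t) + \tfrac12 \tilde x_2^T(t)N_1(t)\tilde x_2(t) + \tfrac12 \operatorname{tr}\int_t^T \big[N_1(s) - S(s)\big]P_2^2(s)h_2^2(s)W_2^{-1}\, ds \\
&= \frac{3r_{11}^2r_{22}^2\,\hat x_2^2(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \\
&\quad + \frac{3r_{11}^2\,\tilde x_2^2(t)}{6r_{11}^2 + a^2K_1^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big]} + \frac12 \int_t^T \big[N_1(s) - S(s)\big]P_2^2(s)h_2^2(s)W_2^{-1}\, ds
\end{aligned} \tag{6.71}$$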


Note that we could obtain the gain coefficients of the controls for player 1 and player 2 in the delayed commitment solutions in closed form because all the differential equations involved in the computations are initial value problems. Furthermore, as can be seen from Table 5.2, the coefficients of the filtering equations can also be precomputed, i.e., they can be calculated off-line, and are again simple initial value problems.


6.3  PRIOR  COMMITMENT  SOLUTION 

In the case of the prior commitment strategies, the gain matrix for the error term in player 1's control is coupled to the error covariance matrix of the Kalman filter. Let us assume that the covariance of the error of player 2's initial estimate is

$$P_0 = p_0 \tag{6.72}$$

then from Table 5.1, the following set of simultaneous differential equations is found for the second term in player 1's control and the nature of player 2's estimator,


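$$\dot N = \frac{a^2K_2^2\tau_2^2}{r_{22}^2}\big[1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big]^2 S^2(t) + \frac{a^2K_1^2\tau_1^2}{r_{11}^2}\big[1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big]^2\big\{N^2(t) + 2S(t)N(t)\big\} + \frac{2N(t)P(t)h_2^2}{W_2}; \quad N(T) = 0 \tag{6.73}$$

$$\dot P = -2\,\frac{a^2K_1^2\tau_1^2}{r_{11}^2}\big[1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big]^2\big[S(t) + N(t)\big]P(t) - \frac{P^2(t)h_2^2}{W_2}; \quad P(t_0) = p_0 \tag{6.74}$$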


Note that we are now faced with solving a nonlinear two-point boundary value problem. Experience has shown that such a problem, if solved directly, is very sensitive to the error of the unknown or guessed initial conditions. Frequently, the guessed value of the missing initial condition has to be practically the correct value before the problem will converge. Hence, we have to resort to such computational techniques as quasilinearization or invariant imbedding to solve the above equations, thus greatly increasing the computational load as compared to simple initial value problems.
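To make the coupling concrete, the sketch below iterates on the scalar two-point boundary value problem for N(t) and P(t) with a simple forward-backward sweep: P is integrated forward from its initial value with the current guess of N, then N is integrated backward from N(T) = 0 with that P, and the two passes are repeated. This is a swapped-in illustrative technique, not the quasilinearization method named in the text. The explicit Euler steps, h2 = 1, W2 = 100 and the function names are all assumptions; the remaining parameters follow Table 6.1.

```python
import math

A2, K1, K2 = 0.04, 32.2, 32.2
TAU1, TAU2 = 1.0, 2.0
R11SQ, R22SQ = 0.001, 0.004
T_FINAL, P0, W2, H2 = 10.0, 100.0, 100.0, 1.0   # H2 and W2 are assumed values


def bracket(tau, t_go):
    return (6.0 * tau ** 2 * t_go - 6.0 * tau * t_go ** 2 + 2.0 * t_go ** 3
            + 3.0 * tau ** 3 * (1.0 - math.exp(-2.0 * t_go / tau))
            - 12.0 * tau ** 2 * t_go * math.exp(-t_go / tau))


def s_closed(t_go):
    """Closed-form S as a function of time-to-go."""
    den = (6.0 * R11SQ * R22SQ + A2 * K1 ** 2 * R22SQ * bracket(TAU1, t_go)
           - A2 * K2 ** 2 * R11SQ * bracket(TAU2, t_go))
    return 6.0 * R11SQ * R22SQ / den


def gain_sq(k, tau, r_sq, t_go):
    """Squared control gain G_i(t)^2 of the reduced scalar system."""
    return A2 * k * k * tau * tau / r_sq * (1.0 - math.exp(-t_go / tau) - t_go / tau) ** 2


def sweep(n_steps=10000, n_iter=6):
    dt = T_FINAL / n_steps
    t_go = [T_FINAL - k * dt for k in range(n_steps + 1)]
    s = [s_closed(x) for x in t_go]
    g1 = [gain_sq(K1, TAU1, R11SQ, x) for x in t_go]
    g2 = [gain_sq(K2, TAU2, R22SQ, x) for x in t_go]
    n = [0.0] * (n_steps + 1)            # initial guess N(t) = 0
    p = [P0] * (n_steps + 1)
    history = []
    for _ in range(n_iter):
        # Forward pass: dP/dt = -2 G1^2 (S + N) P - P^2 h2^2 / W2, P(t0) = P0.
        p = [P0] * (n_steps + 1)
        for k in range(n_steps):
            dp = -2.0 * g1[k] * (s[k] + n[k]) * p[k] - p[k] ** 2 * H2 ** 2 / W2
            p[k + 1] = p[k] + dt * dp
        # Backward pass: dN/dt = G2^2 S^2 + G1^2 (N^2 + 2 S N) + 2 N P h2^2 / W2, N(T) = 0.
        n_new = [0.0] * (n_steps + 1)
        for k in range(n_steps, 0, -1):
            dn = (g2[k] * s[k] ** 2 + g1[k] * (n_new[k] ** 2 + 2.0 * s[k] * n_new[k])
                  + 2.0 * n_new[k] * p[k] * H2 ** 2 / W2)
            n_new[k - 1] = n_new[k] - dt * dn
        n = n_new
        history.append(n[0])
    return n, p, history


n_gain, p_var, n0_history = sweep()
print(n0_history)
```

The list of successive values of N at the initial time gives a rough indication of whether the sweep has settled; a production solver would use a higher-order integrator and a proper convergence test.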

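Leaving $N(t)$ and $P(t)$ undetermined, the optimal prior commitment strategy controls are then given by

$$\begin{aligned}
u_1^\circ(t) &= G_1^T(t)S(t)x(t) + G_1^T(t)N(t)\tilde x_2(t) \\
&= \frac{6aK_1\tau_1 r_{11}r_{22}^2\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)\,x(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \\
&\quad + \frac{aK_1\tau_1}{r_{11}}\big(1 - e^{-\bar T/\tau_1} - \bar T/\tau_1\big)N(t)\tilde x_2(t)
\end{aligned} \tag{6.75}$$

and

$$u_2^\circ(t) = G_2^T(t)S(t)\hat x_2(t) = \frac{6aK_2\tau_2 r_{11}^2r_{22}\big(1 - e^{-\bar T/\tau_2} - \bar T/\tau_2\big)\,\hat x_2(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \tag{6.76}$$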

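The expected prior commitment payoff at time $t$ is given by

$$\begin{aligned}
J(u_1^\circ, u_2^\circ) &= \tfrac12 x^T(t)S(t)x(t) + \tfrac12 \tilde x_2^T(t)N(t)\tilde x_2(t) + \tfrac12 \operatorname{tr}\int_t^T N(\tau)P(\tau)H_2^T(\tau)W_2^{-1}(\tau)H_2(\tau)P(\tau)\, d\tau \\
&= \frac{3r_{11}^2r_{22}^2\,x^2(t)}{6r_{11}^2r_{22}^2 + a^2K_1^2r_{22}^2\big[6\tau_1^2\bar T - 6\tau_1\bar T^2 + 2\bar T^3 + 3\tau_1^3(1 - e^{-2\bar T/\tau_1}) - 12\tau_1^2\bar T e^{-\bar T/\tau_1}\big] - a^2K_2^2r_{11}^2\big[6\tau_2^2\bar T - 6\tau_2\bar T^2 + 2\bar T^3 + 3\tau_2^3(1 - e^{-2\bar T/\tau_2}) - 12\tau_2^2\bar T e^{-\bar T/\tau_2}\big]} \\
&\quad + \tfrac12 N(t)\tilde x_2^2(t) + \tfrac12 \int_t^T N(s)P^2(s)h_2^2W_2^{-1}\, ds
\end{aligned} \tag{6.77}$$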


6.4  NUMERICAL  EXAMPLE 

In this section we present a numerical example of the pursuit-evasion problem discussed in the previous sections.

In the selection of parameters, the specification of $a^2$, $R_1 = r_{11}^2$ and $R_2 = r_{22}^2$ in the performance criterion (Equation 6.33) has to be such that the terminal miss is acceptably small, and produces tolerable levels of control for the missile and the aircraft. A choice that frequently results in acceptable levels is [23]:

$(a^2)^{-1}$ = maximum acceptable value of $|y_1(T) - y_2(T)|^2$

$(r_{11}^2)^{-1}$ = $T$ x maximum acceptable value of $|u_1'|^2$

$(r_{22}^2)^{-1}$ = $T$ x maximum acceptable value of $|u_2'|^2$

If we assume a final time $T$ of 10 sec. and a maximum missile acceleration of 10 G's, then $r_{11}^2 = .001\ (\text{G}^2\text{-sec.})^{-1}$. Similarly, for a maximum airplane acceleration of 5 G's, $r_{22}^2 = .004\ (\text{G}^2\text{-sec.})^{-1}$. Assuming a terminal separation of 5 ft., $a^2 = .04\ (\text{ft.}^2)^{-1}$. The constants and parameters used are summarized in Table 6.1.
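The arithmetic behind these weighting factors is short enough to sketch. The function name is hypothetical; the rule coded here is the one stated above, with each weight taken as the reciprocal of the largest acceptable squared quantity (times the final time for the control weights).

```python
def weighting_factors(final_time, u1_max, u2_max, miss_max):
    """Return (a^2, r11^2, r22^2) from the largest acceptable miss and controls."""
    a_sq = 1.0 / (miss_max ** 2)
    r11_sq = 1.0 / (final_time * u1_max ** 2)
    r22_sq = 1.0 / (final_time * u2_max ** 2)
    return a_sq, r11_sq, r22_sq


# T = 10 sec., 10 G missile limit, 5 G airplane limit, 5 ft. terminal separation.
print(weighting_factors(10.0, 10.0, 5.0, 5.0))
```

With these inputs the three factors come out as .04, .001 and .004, matching the values quoted in the text.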

By assuming that $r_{11}^2 < r_{22}^2$ we assure that the relative controllability requirement discussed in Chapter 3 (Equation (3.86)) is satisfied. From the equation for $N$ in Table 5.1 and player 2's estimation equations in both the prior commitment and delayed commitment solutions (Tables 5.1 and 5.2), we see that the range of possibilities of the nature of information available to player 2 depends on the ratio $PH_2^2/W_2$ in the prior commitment or $P_2H_2^2/W_2$ in the delayed commitment game. We have investigated the effect of the nature of the measurement information of player 2 on the game by varying $W_2$ over a range from 10 to $10^4$ ft.².

Obtaining the results for the prior commitment solution required the solution of a non-linear two-point boundary value problem. The quasilinearization technique was used to obtain the solution. It was found that four iterations were sufficient to converge to the solution.

All solutions were obtained on a Control Data Corp. 6400 digital computer using a fourth-order Runge-Kutta integration technique with

TABLE 6.1

CONSTANTS AND PARAMETERS USED IN A NUMERICAL EXAMPLE OF A PURSUIT-EVASION GAME

$T$ = Final time = 10 sec.

$t_0$ = Initial time = 0 sec.

$K_1$ = 32.2 cos $\gamma_{M0}$ ≈ 32.2 ft/sec²-G

$K_2$ = 32.2 cos $\gamma_{T0}$ ≈ 32.2 ft/sec²-G

$\tau_1$ = Missile time constant = 1 sec.

$\tau_2$ = Airplane time constant = 2 sec.

$a^2$ = Terminal miss weighting factor = .04 (ft²)⁻¹

$r_{11}^2$ = Missile control weighting factor = .001 (G²-sec)⁻¹

$r_{22}^2$ = Airplane control weighting factor = .004 (G²-sec)⁻¹

$P_0$ = Initial error covariance = 100 ft²

$W_2$ = Measurement noise covariance = 10 to 10⁴ ft²


an integration interval of .01 seconds. A listing of the computer program is presented in Appendix A. No attempt has been made to optimize the computer program.

The error variance of player 2, $P_2(t)$, in the delayed commitment game is shown for various values of $W_2$ in Figure 6.2. The error variance, $P(t)$, in the prior commitment game is shown in Figure 6.3 for the same range of values of $W_2$. The delayed and prior commitment error variances differ at most by 3.2 percent.

The feedback gains $G_1(t)S(t)$, $G_2(t)S(t)$ and $G_1(t)N(t)$ for the example from zero to 7.5 seconds are shown in Figure 6.4 and on a less sensitive scale from 7.5 seconds to terminal time at 10 seconds in Figure 6.5. The curves for $G_1(t)S(t)$ and $G_2(t)S(t)$ are, of course, independent of $W_2$, but it was found that $G_1(t)N(t)$ is also appropriate for all values of $W_2$ in the range from 10 to $10^4$ ft.². Near the terminal time $G_1(t)N(t)$ is completely independent of $W_2$ and varies less than .1 percent at $t = 5$ seconds for the range of $W_2$ indicated above. This is clear from the equation for $N$ in Table 5.1, which shows that $W_2$ affects $N(t)$ through the term $P/W_2$, and for the latter half of the game $P(t)$ is so small that $W_2$ cannot have an appreciable effect on $N(t)$. Only near the beginning of the game does $G_1(t)N(t)$ vary with $W_2$, but its value is so small that it cannot be displayed on Figure 6.4. At $t = 0$, the values for $G_1(t)N(t)$ are given in Table 6.2.

The curve for $G_1(t)[N_1(t) - S(t)]$ follows that of $G_1(t)N(t)$ so closely as to be indistinguishable on Figures 6.4 and 6.5; the values are compared at various times for $W_2 = 100$ ft.² in Table 6.3. After $t = 6$ seconds, the two values are identical to four decimal places.


[Figure: vertical axis "ERROR VARIANCE"]

Figure 6.3 Error Variance of Player 2 in Prior Commitment Game

[Figures: vertical axis "FEEDBACK GAIN ~ (SECONDS)^(-1/2)"]

TABLE 6.2

VALUES OF $G_1(t)N(t)$ AT $t = 0$

$W_2$ (FT²)      $G_1(0)N(0)$ (SEC)^(-1/2)

10               .1864 E-05
10²              .1432 E-04
10³              .3416 E-04
10⁴              .3944 E-04

TABLE 6.3

COMPARISON OF $G_1(t)N(t)$ WITH $G_1(t)[N_1(t) - S(t)]$ FOR $W_2$ = 100 FT²

TIME (SEC.)      $G_1(t)N(t)$ (SEC)^(-1/2)      $G_1(t)[N_1(t) - S(t)]$ (SEC)^(-1/2)

0                .1432 E-04                     .4013 E-04
.5               .2444 E-04                     .4408 E-04
1.0              .3418 E-04                     .4866 E-04
1.5              .4365 E-04                     .5403 E-04
2.0              .5313 E-04                     .6037 E-04
3.0              .7394 E-04                     .7714 E-04
4.0              .1014 E-03                     .1026 E-03
5.0              .1443 E-03                     .1447 E-03
6.0              .2252 E-03                     .2252 E-03
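The delayed commitment gain in the last column of Table 6.3 can be evaluated directly from closed-form expressions. The sketch below uses the Table 6.1 parameters and the closed form for S(t); the expression for N1(t) is an assumption here, obtained by solving the scalar N1 equation of Table 5.2 with F = 0 in the same way as the S equation. Function names are hypothetical.

```python
import math

A2, K1, K2 = 0.04, 32.2, 32.2
TAU1, TAU2 = 1.0, 2.0
R11SQ, R22SQ = 0.001, 0.004
T_FINAL = 10.0


def bracket(tau, t_go):
    return (6.0 * tau ** 2 * t_go - 6.0 * tau * t_go ** 2 + 2.0 * t_go ** 3
            + 3.0 * tau ** 3 * (1.0 - math.exp(-2.0 * t_go / tau))
            - 12.0 * tau ** 2 * t_go * math.exp(-t_go / tau))


def g1(t):
    """Scaled control gain G1(t) of the reduced system."""
    t_go = T_FINAL - t
    return (math.sqrt(A2) * K1 * TAU1 / math.sqrt(R11SQ)
            * (1.0 - math.exp(-t_go / TAU1) - t_go / TAU1))


def s_closed(t):
    t_go = T_FINAL - t
    den = (6.0 * R11SQ * R22SQ + A2 * K1 ** 2 * R22SQ * bracket(TAU1, t_go)
           - A2 * K2 ** 2 * R11SQ * bracket(TAU2, t_go))
    return 6.0 * R11SQ * R22SQ / den


def n1_closed(t):
    """Assumed closed form of N1(t): only player 1's control enters."""
    t_go = T_FINAL - t
    return 6.0 * R11SQ / (6.0 * R11SQ + A2 * K1 ** 2 * bracket(TAU1, t_go))


def delayed_gain(t):
    """G1(t) [N1(t) - S(t)], the last column of Table 6.3."""
    return g1(t) * (n1_closed(t) - s_closed(t))


for t in (0.0, 3.0, 4.0, 5.0):
    print(t, delayed_gain(t))
```

At t = 0, 3, 4 and 5 seconds these values agree with the tabulated entries to about the printed precision. This column does not depend on the measurement noise level, since neither S nor N1 involves the filter.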



The difference between the prior commitment payoff, $J$, and the delayed commitment payoff for player 1, $J_1$, shows the dependence of $J$ on $W_2$ and is defined in this paper as the relative criterion of the prior commitment game. The relative criterion for the delayed commitment game is obtained by taking the difference between $J_2$ and $J_1$. The relative payoffs for a $W_2$ of $10^3$ ft.² are shown in Figure 6.6. The relative payoffs are always negative, indicating a reduction in player 2's payoff compared to the perfect information game. Furthermore, the relative payoff for the delayed commitment game $(J_2 - J_1)$ is more negative than that of the prior commitment game $(J - J_1)$, indicating the relationship between the payoffs as discussed in Section 5.3 (see Figure 5.1).


[Figure 6.6: plot not reproduced. Vertical axis: RELATIVE PAYOFF.]

CHAPTER  7 


THE  NOISY/NOISY  DIFFERENTIAL  GAME 

In this chapter we extend the presentation of the previous
chapters, where either both players or only one player had perfect
state information, to the case where both players have noise corrupted
measurements.

Since both players are faced with the problem of extracting
useful information from their noise corrupted measurements, and
neither player can determine exactly his opponent's estimation error,
we are led in the prior commitment formulation to the addition of
correction terms in each player's controller and thus initiate the
vicious cycle of estimates of estimates.

The problem formulation for this chapter is as defined in
Section 4.1. The basic equations are repeated below, but for a more
careful definition the reader is referred to the above mentioned
section. The dynamic system is described by

\[
\begin{aligned}
\dot{x}(t) &= F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t) \\
z_1(t) &= H_1(t)x(t) + w_1(t) \\
z_2(t) &= H_2(t)x(t) + w_2(t)
\end{aligned}
\tag{7.1}
\]

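The noise processes $\{w_1(t)\}$ and $\{w_2(t)\}$ are white Gaussian, with
properties

\[
\begin{aligned}
E[w_1(t)\,w_1^T(\tau)] &= W_1(t)\,\delta(t - \tau) \\
E[w_2(t)\,w_2^T(\tau)] &= W_2(t)\,\delta(t - \tau) \\
E[w_1(t)\,w_2^T(\tau)] &= 0
\end{aligned}
\tag{7.2}
\]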


For simplicity it is assumed that both players consider the initial
condition $x(t_0)$ to be a Gaussian random variable, uncorrelated for
all $t$ with $w_1(t)$ and $w_2(t)$, and having a mean of $\hat{x}_0$ and a
covariance

\[
\operatorname{cov}\,[x(t_0), x(t_0)] = P_0
\tag{7.3}
\]


The cost functional or payoff to the game is quadratic:

\[
J(u_1, u_2) = \tfrac{1}{2} E\Big[ x^T(T)x(T) + \int_{t_0}^{T} u_1^T(t)u_1(t)\,dt - \int_{t_0}^{T} u_2^T(t)u_2(t)\,dt \Big]
\tag{7.4}
\]


The class of admissible strategies is restricted to those $U_1$ and $U_2$
which give rise to the feedback control laws


U1  5  U1  "  (*>•*> 
U2  :  u2  "  “2 <*2  ****** 


(7.5) 


The delayed commitment strategy to the above defined stochastic
differential game is obtained in Section 1 for player 1, and in
Section 2 we obtain the delayed commitment solution for player 2.


7.1  DELAYED COMMITMENT SOLUTION FOR PLAYER 1


From the point of view of the minimizing player, player 1, the
performance criterion during the actual play of the game at time $t$ is

\[
J_1(u_1, u_2) = \tfrac{1}{2} E\Big[ x^T(T)x(T) + \int_{t_0}^{T} \big[ u_1^T(\tau)u_1(\tau) - u_2^T(\tau)u_2(\tau) \big]\,d\tau \,\Big|\, Z_1(t) \Big]
\tag{7.6}
\]

and he obtains his secure strategy by finding the saddle-point solution
to the above equation subject to

\[
\dot{x} = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_0) = x_0
\tag{7.7}
\]

Similarly to our assumption in Chapter 5 we assume, for the purpose of
determining player 1's secure strategy solution, that the allowable
strategy for player 2, in addition to being $Z_2(t)$ measurable, is also
$Z_1(t)$ measurable. Thus we want to determine that $u_1^* \in U_1$ and
$u_2^\circ \in U_1 \times U_2$ which are optimal in the sense that for all
$t \in [t_0, T]$

\[
J_1(u_1^*, u_2) \le J_1(u_1^*, u_2^\circ) \le J_1(u_1, u_2^\circ)
\tag{7.8}
\]


Hence  from  player  l's  point  of  view  of  a  secure  strategy,  player  2 
maximises  at  t  >  t 




But for arbitrary $t = t_0$, $u_1(t)$ and $u_2(t)$ we can write the solution to
the system Equation (7.7) as

\[
x(T) = \Phi(T, t_0)x_0 + \int_{t_0}^{T} \big[ -\Phi(T, \tau)G_1(\tau)u_1(\tau) + \Phi(T, \tau)G_2(\tau)u_2(\tau) \big]\,d\tau
\tag{7.11}
\]

where $\Phi(t, t_0)$ is the state transition matrix which must satisfy the
relation

\[
\frac{\partial \Phi(t, t_0)}{\partial t} = F(t)\,\Phi(t, t_0); \quad \Phi(t_0, t_0) = I
\tag{7.12}
\]
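The defining relation (7.12) can be integrated numerically when no closed form for the transition matrix is at hand. The following is a minimal sketch, assuming a 2 x 2 system matrix and a fixed-step fourth-order Runge-Kutta scheme; the helper names and the two example systems (a double integrator and a harmonic oscillator) are illustrative choices, not taken from the report.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def axpy(alpha, x, y):
    """Return alpha * x + y for 2x2 matrices."""
    return [[alpha * x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def transition_matrix(f_of_t, t0, t1, steps=200):
    """Integrate dPhi/dt = F(t) Phi, Phi(t0, t0) = I, from t0 to t1 with RK4."""
    phi = [[1.0, 0.0], [0.0, 1.0]]
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = matmul(f_of_t(t), phi)
        k2 = matmul(f_of_t(t + h / 2), axpy(h / 2, k1, phi))
        k3 = matmul(f_of_t(t + h / 2), axpy(h / 2, k2, phi))
        k4 = matmul(f_of_t(t + h), axpy(h, k3, phi))
        incr = axpy(2.0, k2, k1)      # k1 + 2 k2
        incr = axpy(2.0, k3, incr)    # k1 + 2 k2 + 2 k3
        incr = axpy(1.0, k4, incr)    # k1 + 2 k2 + 2 k3 + k4
        phi = axpy(h / 6, incr, phi)
        t += h
    return phi


def double_integrator(_t):
    return [[0.0, 1.0], [0.0, 0.0]]


def oscillator(_t):
    return [[0.0, 1.0], [-1.0, 0.0]]


phi_di = transition_matrix(double_integrator, 0.0, 2.0)
phi_osc = transition_matrix(oscillator, 0.0, 1.0)
```

For the double integrator the result should match the known closed form [[1, t - t_0], [0, 1]], and for the oscillator the rotation matrix with entries cos and sin of the elapsed time.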

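Hence in terms of the Hilbert space notation developed in Chapter 3
we can write

\[
\max_{u_2 \in U_1 \times U_2} \tfrac{1}{2} E\big[ \langle \Phi x_0 - T_1u_1 + T_2u_2,\ \Phi x_0 - T_1u_1 + T_2u_2 \rangle
+ \langle u_1, u_1 \rangle - \langle u_2, u_2 \rangle \,\big|\, Z_1(t), Z_2(t) \big]
\tag{7.13}
\]

If we now define

\[
P(t) = E\big[ (x - \hat{x})(x - \hat{x})^T \,\big|\, Z_1(t), Z_2(t) \big]
\tag{7.14}
\]

where

\[
\hat{x} = E\big[ x(t) \,\big|\, Z_1(t), Z_2(t) \big]
\tag{7.15}
\]

and consider the term $E[\langle \Phi x_0, \Phi x_0 \rangle \,|\, Z_1(t), Z_2(t)]$ of Equation
(7.13), then we can write for arbitrary $t = t_0$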


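\[
\begin{aligned}
E\big[ \langle \Phi x_0, \Phi x_0 \rangle \,\big|\, Z_1(t), Z_2(t) \big]
&= E\big[ \langle \Phi(x_0 - \hat{x}_0 + \hat{x}_0),\ \Phi(x_0 - \hat{x}_0 + \hat{x}_0) \rangle \,\big|\, Z_1(t), Z_2(t) \big] \\
&= E\big[ \langle \Phi(x_0 - \hat{x}_0), \Phi(x_0 - \hat{x}_0) \rangle + \langle \Phi(x_0 - \hat{x}_0), \Phi\hat{x}_0 \rangle \\
&\qquad + \langle \Phi\hat{x}_0, \Phi(x_0 - \hat{x}_0) \rangle + \langle \Phi\hat{x}_0, \Phi\hat{x}_0 \rangle \,\big|\, Z_1(t), Z_2(t) \big]
\end{aligned}
\tag{7.16}
\]

But the two middle terms in the above expression are equal to zero,
while the first term can be written as $\operatorname{tr}[\Phi^T \Phi P_0]$, thus

\[
E\big[ \langle \Phi x_0, \Phi x_0 \rangle \,\big|\, Z_1(t), Z_2(t) \big] = \operatorname{tr}[\Phi^T \Phi P_0] + \langle \Phi\hat{x}_0, \Phi\hat{x}_0 \rangle
\tag{7.17}
\]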

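and we can rewrite Equation (7.13) as

\[
\max_{u_2 \in U_1 \times U_2} \tfrac{1}{2} \big[ \langle \Phi\hat{x}_0 - T_1u_1 + T_2u_2,\ \Phi\hat{x}_0 - T_1u_1 + T_2u_2 \rangle
+ \langle u_1, u_1 \rangle - \langle u_2, u_2 \rangle \big] + \tfrac{1}{2} \operatorname{tr}[\Phi^T \Phi P_0]
\tag{7.18}
\]

However, $\operatorname{tr}[\Phi^T \Phi P_0]$ is independent of the control $u_2$, thus maximizing
Equation (7.18) with respect to $u_2(t)$ for arbitrary $u_1(t)$ is
equivalent to maximizing $\hat{J}_1(u_1, u_2)$, where

\[
\hat{J}_1(u_1, u_2) = \tfrac{1}{2} \big[ \langle \Phi\hat{x}_0 - T_1u_1 + T_2u_2,\ \Phi\hat{x}_0 - T_1u_1 + T_2u_2 \rangle
+ \langle u_1, u_1 \rangle - \langle u_2, u_2 \rangle \big]
\tag{7.19}
\]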
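The splitting of the conditional expectation into a trace term and an estimate term, Equation (7.17), can be checked exactly on a small example. The sketch below assumes a two-point error distribution (the state equals the estimate plus or minus a fixed error vector, each with probability one half), so that the error covariance is the outer product of that vector with itself. The matrix `phi` and the vectors `x_hat` and `err` are arbitrary illustrative numbers.

```python
def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def inner(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


phi = [[1.0, 0.5], [0.0, 2.0]]   # stand-in for the transition matrix
x_hat = [1.0, -1.0]              # conditional mean of the state
err = [0.3, 0.2]                 # error takes the values +err and -err

# Left-hand side of (7.17): exact expectation over the two outcomes.
outcomes = [[x_hat[i] + sign * err[i] for i in range(2)] for sign in (1.0, -1.0)]
lhs = sum(0.5 * inner(matvec(phi, x), matvec(phi, x)) for x in outcomes)

# Right-hand side of (7.17): tr[Phi^T Phi P] + <Phi x_hat, Phi x_hat>.
cov = [[err[i] * err[j] for j in range(2)] for i in range(2)]
phi_t_phi = [[sum(phi[k][i] * phi[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
trace_term = sum(phi_t_phi[i][k] * cov[k][i] for i in range(2) for k in range(2))
rhs = trace_term + inner(matvec(phi, x_hat), matvec(phi, x_hat))
```

Because the error has zero mean, the two cross terms cancel between the outcomes, which is the finite-dimensional counterpart of the statement that the two middle terms of (7.16) vanish.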

From the results of Chapter 3 we know that, whenever the inverse
of $(I - T_2T_2^*)$ exists, the candidate extremal control $u_2^\circ$ is

\[
u_2^\circ = T_2^*(I - T_2T_2^*)^{-1}(\Phi\hat{x}_0 - T_1u_1) = T_2^*D_2(\Phi\hat{x}_0 - T_1u_1).
\tag{7.20}
\]
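A scalar analogue makes the form of the extremal control (7.20) easy to verify. In the sketch below every operator is replaced by a real number: `t2` stands in for $T_2$ with `t2**2 < 1` so that the inverse exists, and `a` stands in for $\Phi\hat{x}_0 - T_1u_1$. These scalar stand-ins and the function names are assumptions made for illustration only.

```python
def payoff_in_u2(u2, a, t2):
    """Scalar version of the u2-dependent part of (7.19)."""
    return 0.5 * (a + t2 * u2) ** 2 - 0.5 * u2 ** 2


def candidate_u2(a, t2):
    """Scalar version of (7.20): u2 = T2 * (1 - T2^2)^(-1) * a."""
    return t2 * a / (1.0 - t2 * t2)


a, t2 = 1.7, 0.6
u2_star = candidate_u2(a, t2)
best = payoff_in_u2(u2_star, a, t2)
# Payoff at nearby controls on either side of the candidate.
perturbed = [payoff_in_u2(u2_star + delta, a, t2) for delta in (-0.5, -0.01, 0.01, 0.5)]
```

Since `t2**2 < 1` the payoff is strictly concave in `u2`, so the stationary point is the maximizer; when `t2**2` reaches 1 the inverse in (7.20) fails to exist, matching the caveat in the text.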


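Furthermore, the linear-Gaussian assumptions imply that $\hat{x}_0$ can be
generated for any time $t$ by a Kalman-Bucy filter using a prior estimate
of the initial state, $\hat{x}_0$, a prior estimate of the variance of the
error of this estimate, $P_0$, the noise corrupted measurements $z_1(t)$ and
$z_2(t)$ of the state up to time $t$ and the dynamic equations

\[
\dot{\hat{x}}(t) = F(t)\hat{x}(t) - G_1(t)u_1(t) + G_2(t)u_2(t)
+ P(t) \begin{bmatrix} H_1^T(t) & H_2^T(t) \end{bmatrix}
\begin{bmatrix} W_1^{-1}(t) & 0 \\ 0 & W_2^{-1}(t) \end{bmatrix}
\begin{bmatrix} z_1(t) - H_1(t)\hat{x}(t) \\ z_2(t) - H_2(t)\hat{x}(t) \end{bmatrix}
\tag{7.21}
\]

with

\[
\hat{x}(t_0) = \hat{x}_0
\]

and

\[
\dot{P}(t) = F(t)P(t) + P(t)F^T(t)
- P(t) \begin{bmatrix} H_1^T(t) & H_2^T(t) \end{bmatrix}
\begin{bmatrix} W_1^{-1}(t) & 0 \\ 0 & W_2^{-1}(t) \end{bmatrix}
\begin{bmatrix} H_1(t) \\ H_2(t) \end{bmatrix} P(t)
\tag{7.22}
\]

with

\[
P(t_0) = P_0
\]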
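For a scalar state the variance equation (7.22) can be integrated forward in a few lines. The sketch below assumes constant scalar coefficients and uses a fixed-step Runge-Kutta scheme; with `f = 0` the equation has the closed form P(t) = P_0 / (1 + P_0 c t), where c is the combined information rate of the two measurement channels. All numerical values and function names are illustrative assumptions.

```python
def variance_rate(p, f, h1, w1, h2, w2):
    """Scalar form of (7.22): dP/dt = 2 F P - P^2 (H1^2/W1 + H2^2/W2)."""
    return 2.0 * f * p - p * p * (h1 * h1 / w1 + h2 * h2 / w2)


def integrate_variance(p0, f, h1, w1, h2, w2, t_end, steps=1000):
    """Forward RK4 integration of the scalar variance from t = 0 to t_end."""
    p = p0
    h = t_end / steps
    for _ in range(steps):
        k1 = variance_rate(p, f, h1, w1, h2, w2)
        k2 = variance_rate(p + 0.5 * h * k1, f, h1, w1, h2, w2)
        k3 = variance_rate(p + 0.5 * h * k2, f, h1, w1, h2, w2)
        k4 = variance_rate(p + h * k3, f, h1, w1, h2, w2)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


P0, T_END = 4.0, 3.0
H1, W1, H2, W2 = 1.0, 2.0, 1.0, 0.5

p_both = integrate_variance(P0, 0.0, H1, W1, H2, W2, T_END)
# Dropping the second channel (H2 = 0) leaves player 1's measurement only.
p_one = integrate_variance(P0, 0.0, H1, W1, 0.0, W2, T_END)
p_closed = P0 / (1.0 + P0 * (H1 * H1 / W1 + H2 * H2 / W2) * T_END)
```

The variance driven by both channels is smaller than the variance driven by one channel alone, which is why the estimate conditioned on both measurement histories is at least as accurate as either player's own estimate.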


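Thus we can write

\[
u_2^\circ = T_2^*D_2(\Phi\hat{x} - T_1u_1)
\tag{7.23}
\]

Substituting Equation (7.23) into Equation (7.6) we obtain as
payoff functional for player 1 at arbitrary time $t = t_0$

\[
\begin{aligned}
J_1(u_1) = \tfrac{1}{2} E\big[ &\langle \Phi x_0 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_0 - T_1u_1),\ \Phi x_0 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_0 - T_1u_1) \rangle \\
&+ \langle u_1, u_1 \rangle - \langle T_2T_2^*D_2(\Phi\hat{x}_0 - T_1u_1),\ D_2(\Phi\hat{x}_0 - T_1u_1) \rangle \,\big|\, Z_1(t) \big]
\end{aligned}
\tag{7.24}
\]

which player 1 seeks to minimize.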

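If we define

\[
P_1(t) = E\big[ (x - \hat{x}_1)(x - \hat{x}_1)^T \,\big|\, Z_1(t) \big]
\]

where $\hat{x}_1(t) = E[x(t) \,|\, Z_1(t)]$, and recalling that the double
expectation, first given more information, then less information
(information is taken away), is the same as the expectation given the
less information only, see [24], then we can write $J_1(u_1)$ as

\[
\begin{aligned}
J_1(u_1) = \tfrac{1}{2} \big[ &\langle \Phi\hat{x}_1 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_1 - T_1u_1),\ \Phi\hat{x}_1 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_1 - T_1u_1) \rangle \\
&+ \langle u_1, u_1 \rangle - \langle T_2^*D_2(\Phi\hat{x}_1 - T_1u_1),\ T_2^*D_2(\Phi\hat{x}_1 - T_1u_1) \rangle \big] + \tfrac{1}{2} \operatorname{tr}(\Phi^T \Phi P_1)
\end{aligned}
\tag{7.25}
\]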


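Minimizing the above expression with respect to $u_1(t)$ is equivalent to
minimizing $\hat{J}_1(u_1)$, where

\[
\begin{aligned}
\hat{J}_1(u_1) = \tfrac{1}{2} \big[ &\langle \Phi\hat{x}_1 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_1 - T_1u_1),\ \Phi\hat{x}_1 - T_1u_1 + T_2T_2^*D_2(\Phi\hat{x}_1 - T_1u_1) \rangle \\
&+ \langle u_1, u_1 \rangle - \langle T_2T_2^*D_2(\Phi\hat{x}_1 - T_1u_1),\ D_2(\Phi\hat{x}_1 - T_1u_1) \rangle \big]
\end{aligned}
\tag{7.26}
\]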


Again drawing upon the results obtained in Chapter 3, we know
that the minimizing control for player 1 is

\[
u_1^* = T_1^* \big[ I + T_1T_1^* - T_2T_2^* \big]^{-1} \Phi\hat{x}_1 = T_1^* D\, \Phi\hat{x}_1
\tag{7.27}
\]
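The step from the reduced payoff (7.26) to the minimizing control (7.27) can be confirmed in a scalar setting. The sketch below replaces the operators by real numbers `t1` and `t2` (with `t2**2 < 1`) and $\Phi\hat{x}_1$ by the number `phi_x1`; these stand-ins and the function names are illustrative assumptions rather than quantities from the report.

```python
def reduced_cost(u1, phi_x1, t1, t2):
    """Scalar version of (7.26) after inserting player 2's response (7.23)."""
    a = phi_x1 - t1 * u1
    d2 = 1.0 / (1.0 - t2 * t2)
    u2 = t2 * d2 * a                 # player 2's response to u1
    terminal = a + t2 * u2           # terminal state estimate
    return 0.5 * (terminal ** 2 + u1 ** 2 - u2 ** 2)


def u1_star(phi_x1, t1, t2):
    """Scalar version of (7.27): u1 = T1 (1 + T1^2 - T2^2)^(-1) Phi x1."""
    return t1 * phi_x1 / (1.0 + t1 * t1 - t2 * t2)


phi_x1, t1, t2 = 2.0, 0.8, 0.5
u_opt = u1_star(phi_x1, t1, t2)
cost_opt = reduced_cost(u_opt, phi_x1, t1, t2)
# Cost at nearby controls on either side of the candidate minimizer.
cost_nearby = [reduced_cost(u_opt + delta, phi_x1, t1, t2)
               for delta in (-0.3, -0.01, 0.01, 0.3)]
```

In this scalar case the reduced cost collapses to one half of `a**2 * d2 + u1**2`, which is convex in `u1`, so the stationary point given by (7.27) is the minimum.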


The  dynamic  system  from  player  l's  point  of  view  can  then  be 
written  from  Equations  (7.7)  and  (7.21)  as 


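It then follows from the linear-Gaussian assumptions that the optimal
estimates of $x(t)$ and $\hat{x}(t)$ given $Z_1(t)$, i.e.,

\[
\hat{x}_1(t) = E\big[ x(t) \,\big|\, Z_1(t) \big]
\]

\[
\hat{x}_{12}(t) = E\big[ \hat{x}(t) \,\big|\, Z_1(t) \big]
= E\big[ E[x(t) \,|\, Z_1(t), Z_2(t)] \,\big|\, Z_1(t) \big]
= E\big[ x(t) \,\big|\, Z_1(t) \big] = \hat{x}_1(t)
\tag{7.31}
\]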


(7.32) 


with initial conditions

\[
\hat{x}_1(t_0) = \hat{x}_0, \qquad \hat{x}_{12}(t_0) = \hat{x}_0
\tag{7.33}
\]


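while the error covariance matrix satisfies

\[
\begin{aligned}
\begin{bmatrix} \dot{P}_{11} & \dot{P}_{12} \\ \dot{P}_{12} & \dot{P}_{22} \end{bmatrix}
&= \begin{bmatrix} F & G_2T_2^*D_2\Phi \\ P H_2^T W_2^{-1} H_2 & F + G_2T_2^*D_2\Phi - P H_1^T W_1^{-1} H_1 - P H_2^T W_2^{-1} H_2 \end{bmatrix}
\begin{bmatrix} P_{11} & P_{12} \\ P_{12} & P_{22} \end{bmatrix} \\
&\quad + \begin{bmatrix} P_{11} & P_{12} \\ P_{12} & P_{22} \end{bmatrix}
\begin{bmatrix} F & G_2T_2^*D_2\Phi \\ P H_2^T W_2^{-1} H_2 & F + G_2T_2^*D_2\Phi - P H_1^T W_1^{-1} H_1 - P H_2^T W_2^{-1} H_2 \end{bmatrix}^T \\
&\quad + \begin{bmatrix} 0 & 0 \\ 0 & P H_2^T W_2^{-1} H_2 P \end{bmatrix}
- \begin{bmatrix} P_{11} & P_{12} \\ P_{12} & P_{22} \end{bmatrix}
\begin{bmatrix} H_1^T W_1^{-1} H_1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} P_{11} & P_{12} \\ P_{12} & P_{22} \end{bmatrix}
\end{aligned}
\tag{7.34}
\]

with

\[
\begin{bmatrix} P_{11}(t_0) & P_{12}(t_0) \\ P_{12}(t_0) & P_{22}(t_0) \end{bmatrix}
= \begin{bmatrix} P_0 & 0 \\ 0 & 0 \end{bmatrix}
\tag{7.35}
\]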


The optimal minimizing control for player 1 can thus be written

\[
u_1^* = T_1^* D\, \Phi\hat{x}_1
\tag{7.36}
\]

and the corresponding optimal response of player 2 is then

\[
u_2^\circ = T_2^*D_2\Phi\hat{x} - T_2^*D_2T_1T_1^*D\,\Phi\hat{x}_1
\tag{7.37}
\]


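The Kalman-Bucy filter (Equation (7.21)) and its corresponding
error covariance matrix (Equation (7.22)) can be simplified by the
following observations. Rewriting the conditional estimates (Equation
(7.32)) we obtain

\[
\dot{\hat{x}}_1 = F\hat{x}_1 + G_2T_2^*D_2\Phi\hat{x}_{12} - \big[ G_1 + G_2T_2^*D_2T_1 \big] u_1
+ P_{11} H_1^T W_1^{-1} \big[ z_1 - H_1\hat{x}_1 \big]; \quad \hat{x}_1(t_0) = \hat{x}_0
\tag{7.38}
\]

\[
\begin{aligned}
\dot{\hat{x}}_{12} = {}& F\hat{x}_{12} + G_2T_2^*D_2\Phi\hat{x}_{12} - \big[ G_1 + G_2T_2^*D_2T_1 \big] u_1 \\
&+ P_{12} H_1^T W_1^{-1} \big[ z_1 - H_1\hat{x}_1 \big] + P H_1^T W_1^{-1} \big( z_1 - H_1\hat{x}_{12} \big) \\
&+ P H_2^T W_2^{-1} H_2 \hat{x}_1 - P H_2^T W_2^{-1} H_2 \hat{x}_{12}; \quad \hat{x}_{12}(t_0) = \hat{x}_0
\end{aligned}
\tag{7.39}
\]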


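and we observe that, since

\[
\hat{x}_1(t) = E\big[ x \,\big|\, Z_1(t) \big] = \hat{x}_{12}(t) = E\big[ E[x \,|\, Z_1(t), Z_2(t)] \,\big|\, Z_1(t) \big]
\tag{7.40}
\]

it implies that

\[
P_1(t) = P_{11}(t) = P(t) + P_{12}(t)
\tag{7.41}
\]

and thus

\[
P_{11}(t) = P(t) + P_{12}(t)
\tag{7.42}
\]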


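Using Equation (7.42) we then obtain from the error covariance
matrix (Equation (7.34))

\[
\begin{aligned}
\dot{P}_{12} = {}& F P_{12} + G_2T_2^*D_2\Phi P_{22} + P_{11} H_2^T W_2^{-1} H_2 P + P_{12} F^T \\
&+ P_{12} \big[ G_2T_2^*D_2\Phi \big]^T - P_{12} H_1^T W_1^{-1} H_1 P - P_{12} H_2^T W_2^{-1} H_2 P \\
&- P H_1^T W_1^{-1} H_1 P_{12} - P_{12} H_1^T W_1^{-1} H_1 P_{12}; \quad P_{12}(t_0) = 0
\end{aligned}
\tag{7.43}
\]

and

\[
\begin{aligned}
\dot{P}_{22} = {}& P H_2^T W_2^{-1} H_2 P_{12} + F P_{22} + G_2T_2^*D_2\Phi P_{22} - P H_1^T W_1^{-1} H_1 P_{22} \\
&- P H_2^T W_2^{-1} H_2 P_{22} + P_{12} H_2^T W_2^{-1} H_2 P + P_{22} F^T \\
&+ P_{22} \big[ G_2T_2^*D_2\Phi \big]^T - P_{22} H_1^T W_1^{-1} H_1 P - P_{22} H_2^T W_2^{-1} H_2 P \\
&+ P H_2^T W_2^{-1} H_2 P - P_{12} H_1^T W_1^{-1} H_1 P_{12}; \quad P_{22}(t_0) = 0
\end{aligned}
\tag{7.44}
\]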


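Thus the Kalman-Bucy filter and its error covariance matrix, Equations
(7.32) and (7.34) respectively, reduce to

\[
\begin{aligned}
\dot{P}_{12} = {}& F P_{12} + P_{12} F^T + G_2T_2^*D_2\Phi P_{12} + P_{12} \big( G_2T_2^*D_2\Phi \big)^T \\
&- P_{11} H_1^T W_1^{-1} H_1 P_{12} - P_{12} H_1^T W_1^{-1} H_1 P_{11} + P_{12} H_1^T W_1^{-1} H_1 P_{12} \\
&+ P_{11} H_2^T W_2^{-1} H_2 P_{11} - P_{11} H_2^T W_2^{-1} H_2 P_{12} - P_{12} H_2^T W_2^{-1} H_2 P_{11} \\
&+ P_{12} H_2^T W_2^{-1} H_2 P_{12}; \quad P_{12}(t_0) = 0
\end{aligned}
\tag{7.49}
\]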


The above results can be written in terms of solutions to matrix
Riccati equations. It was shown in Chapter 3 (Equation (3.87)) that
$T_1^* D\, \Phi\hat{x}_1$ can be written as $G_1^T(t)S(t)\hat{x}_1(t)$, where $S(t)$ satisfies
Equation (3.98). In Chapter 5 we found (Equations (5.55) through
(5.61)) that $T_1^* D_1 \Phi x$ could be written as $G_1^T(t)N_1(t)x(t)$. By a
completely parallel argument we can show that we can write $T_2^* D_2 \Phi\hat{x}$ as
$G_2^T(t)N_2(t)\hat{x}(t)$, where $N_2(t)$ satisfies

\[
\dot{N}_2(t) = -N_2F(t) - F^T(t)N_2 - N_2G_2(t)G_2^T(t)N_2; \quad N_2(T) = I
\tag{7.50}
\]

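If we further define

\[
R_2(t) \triangleq \Phi^T(T, t)\,D_2(t)\,R_{21}(t)\,D(t)\,\Phi(T, t)
\tag{7.51}
\]

where

\[
R_{21}(t) \triangleq T_1T_1^*
\tag{7.52}
\]

then on taking the derivative of $R_2(t)$ with respect to $t$ we obtain by
using

\[
\begin{aligned}
\dot{\Phi}^T(T, t) &= -F^T(t)\,\Phi^T(T, t) \\
\dot{\Phi}(T, t) &= -\Phi(T, t)\,F(t)
\end{aligned}
\tag{7.53}
\]

\[
\begin{aligned}
\dot{R}_2 = {}& -F^T\Phi^TD_2R_{21}D\Phi - \Phi^TD_2\Phi G_2G_2^T\Phi^TD_2R_{21}D\Phi \\
&- \Phi^TD_2\Phi G_1G_1^T\Phi^TD\Phi + \Phi^TD_2R_{21}D\Phi G_1G_1^T\Phi^TD\Phi \\
&- \Phi^TD_2R_{21}D\Phi G_2G_2^T\Phi^TD\Phi - \Phi^TD_2R_{21}D\Phi F
\end{aligned}
\tag{7.54}
\]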


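Substituting Equation (7.51) and the defining equations for $S(t)$ and
$N_2(t)$, i.e.,

\[
\begin{aligned}
S(t) &\triangleq \Phi^T(T, t)\,D(t)\,\Phi(T, t) \\
N_2(t) &\triangleq \Phi^T(T, t)\,D_2(t)\,\Phi(T, t)
\end{aligned}
\tag{7.55}
\]

the resulting equation is

\[
\begin{aligned}
\dot{R}_2 = {}& -R_2F(t) - F^T(t)R_2 - N_2G_2(t)G_2^T(t)R_2 - N_2G_1(t)G_1^T(t)S \\
&+ R_2 \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad R_2(T) = 0
\end{aligned}
\tag{7.56}
\]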
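The differential equation for $R_2$ can be sanity-checked against its definition in a scalar setting. The sketch below assumes $F = 0$ (so the transition matrix is 1), constant scalar gains, and the scalar forms $D_2 = 1/(1 - g_2^2 s)$, $D = 1/(1 + (g_1^2 - g_2^2)s)$ and $R_{21} = g_1^2 s$ with $s = T - t$; these follow from treating $T_iT_i^*$ as the integral of $g_i^2$ over the remaining time. The constants and function names are illustrative assumptions.

```python
G1SQ, G2SQ, T_FINAL = 1.0, 0.25, 1.0   # illustrative g1^2, g2^2 and final time


def gains(t):
    """Return (N_2, S, R_2) from their scalar definitions with F = 0."""
    s = T_FINAL - t
    d2 = 1.0 / (1.0 - G2SQ * s)             # scalar D_2
    d = 1.0 / (1.0 + (G1SQ - G2SQ) * s)     # scalar D
    r21 = G1SQ * s                          # scalar T_1 T_1^*
    return d2, d, d2 * r21 * d


def r2_rhs(t):
    """Right-hand side of the R_2 equation with F = 0."""
    n2, s, r2 = gains(t)
    return -n2 * G2SQ * r2 - n2 * G1SQ * s + r2 * (G1SQ - G2SQ) * s


def r2_dot_numeric(t, h=1e-5):
    """Central-difference derivative of R_2 computed from its definition."""
    return (gains(t + h)[2] - gains(t - h)[2]) / (2.0 * h)


check_times = [0.1, 0.3, 0.5, 0.7, 0.9]
mismatch = max(abs(r2_rhs(t) - r2_dot_numeric(t)) for t in check_times)
```

The finite-difference derivative agrees with the right-hand side to within discretization error, and `gains(T_FINAL)[2]` is zero, matching the terminal condition.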

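From Equations (7.36), (7.47), (7.48) and (7.49) the optimal
delayed commitment strategy for player 1 is then given by the following
set of equations

\[
u_1^*(t) = G_1^T(t)S(t)\hat{x}_1(t)
\tag{7.57}
\]

\[
\dot{S} = -SF(t) - F^T(t)S + S \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad S(T) = I
\tag{7.58}
\]

\[
\begin{aligned}
\dot{\hat{x}}_1(t) = {}& \big[ F(t) - G_1(t)G_1^T(t)S + G_2(t)G_2^T(t)N_2(t) - G_2(t)G_2^T(t)R_2(t) \big] \hat{x}_1(t) \\
&+ P_{11}H_1^T(t)W_1^{-1}(t) \big[ z_1(t) - H_1(t)\hat{x}_1(t) \big]; \quad \hat{x}_1(t_0) = \hat{x}_0
\end{aligned}
\tag{7.59}
\]

\[
\begin{aligned}
\dot{P}_{11} = {}& F(t)P_{11} + P_{11}F^T(t) + G_2(t)G_2^T(t)N_2(t)P_{12} + P_{12}N_2(t)G_2(t)G_2^T(t) \\
&- P_{11}H_1^T(t)W_1^{-1}(t)H_1(t)P_{11}; \quad P_{11}(t_0) = P_0
\end{aligned}
\tag{7.60}
\]

\[
\begin{aligned}
\dot{P}_{12} = {}& F(t)P_{12} + P_{12}F^T(t) + G_2(t)G_2^T(t)N_2(t)P_{12} + P_{12}N_2(t)G_2(t)G_2^T(t) \\
&- P_{11}H_1^T(t)W_1^{-1}(t)H_1(t)P_{12} - P_{12}H_1^T(t)W_1^{-1}(t)H_1(t)P_{11} \\
&+ P_{12}H_1^T(t)W_1^{-1}(t)H_1(t)P_{12} + P_{11}H_2^T(t)W_2^{-1}(t)H_2(t)P_{11} \\
&- P_{11}H_2^T(t)W_2^{-1}(t)H_2(t)P_{12} - P_{12}H_2^T(t)W_2^{-1}(t)H_2(t)P_{11} \\
&+ P_{12}H_2^T(t)W_2^{-1}(t)H_2(t)P_{12}; \quad P_{12}(t_0) = 0
\end{aligned}
\tag{7.61}
\]

\[
\dot{N}_2 = -N_2F(t) - F^T(t)N_2 - N_2G_2(t)G_2^T(t)N_2; \quad N_2(T) = I
\tag{7.62}
\]

\[
\begin{aligned}
\dot{R}_2 = {}& -R_2F(t) - F^T(t)R_2 - N_2G_2(t)G_2^T(t)R_2 - N_2G_1(t)G_1^T(t)S \\
&+ R_2 \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad R_2(T) = 0
\end{aligned}
\tag{7.63}
\]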


Note that the above matrix Riccati type equations do not present a two
point boundary value problem but can all be solved using either forward
or backward integration. This solution can take place "on-line"
with a digital computer during the actual game.
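The remark that no two-point boundary value problem arises can be illustrated with the gain equation for $S$: it carries only a terminal condition, so a single backward sweep from $T$ suffices. The sketch below treats the scalar case with constant coefficients and a fixed-step Runge-Kutta scheme; for `f = 0` the result can be compared with the closed form S(t) = 1 / (1 + (g_1^2 - g_2^2)(T - t)). The numerical values and function names are illustrative assumptions.

```python
def riccati_rhs(s, f, g1sq, g2sq):
    """Scalar form of the S equation: dS/dt = -2 F S + (g1^2 - g2^2) S^2."""
    return -2.0 * f * s + (g1sq - g2sq) * s * s


def integrate_backward(f, g1sq, g2sq, t_final, steps=1000):
    """Sweep from S(T) = 1 back to t = 0 with RK4; returns [(t, S(t)), ...]."""
    h = -t_final / steps          # negative step: integrate backward in time
    t, s = t_final, 1.0
    history = [(t, s)]
    for _ in range(steps):
        k1 = riccati_rhs(s, f, g1sq, g2sq)
        k2 = riccati_rhs(s + 0.5 * h * k1, f, g1sq, g2sq)
        k3 = riccati_rhs(s + 0.5 * h * k2, f, g1sq, g2sq)
        k4 = riccati_rhs(s + h * k3, f, g1sq, g2sq)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        history.append((t, s))
    return history


G1SQ, G2SQ, T_FINAL = 1.0, 0.25, 2.0
history = integrate_backward(0.0, G1SQ, G2SQ, T_FINAL)
s_at_start = history[-1][1]
s_closed = 1.0 / (1.0 + (G1SQ - G2SQ) * T_FINAL)
```

The estimator covariances, in contrast, carry initial conditions and are swept forward from $t_0$, so the two sets of equations can be integrated independently of one another.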


If we now consider the game from the point of view of the
maximizing player, player 2, his performance criterion during the
game at time $t$ is

\[
J_2(u_1, u_2) = \tfrac{1}{2} E\Big[ x^T(T)x(T) + \int_{t_0}^{T} \big[ u_1^T(\tau)u_1(\tau) - u_2^T(\tau)u_2(\tau) \big]\,d\tau \,\Big|\, Z_2(t) \Big]
\tag{7.64}
\]

and his secure strategy can be determined by finding the saddle-point
solution to this equation subject to

\[
\dot{x} = F(t)x(t) - G_1(t)u_1(t) + G_2(t)u_2(t); \quad x(t_0) = x_0
\tag{7.65}
\]


To determine player 2's secure strategy solution we assume that the
allowable strategy for player 1, in addition to being $Z_1(t)$ measurable,
is also $Z_2(t)$ measurable and we seek that $u_2^* \in U_2$ and
$u_1^\circ \in U_1 \times U_2$ which are optimal in the sense that for all
$t \in [t_0, T]$

\[
J_2(u_1^\circ, u_2) \le J_2(u_1^\circ, u_2^*) \le J_2(u_1, u_2^*)
\tag{7.66}
\]

By a completely parallel argument as used for the solution of
the game from player 1's point of view, Equation (7.23), and replacing
max. by min. and player 1 by player 2, we obtain as the candidate
extremal control for player 1

\[
u_1^\circ = T_1^*D_1\Phi\hat{x} + T_1^*D_1T_2u_2
\tag{7.67}
\]

with the Kalman-Bucy filter given by Equations (7.21) and (7.22).

Substituting Equation (7.67) into Equation (7.64) we obtain as
payoff functional for player 2 at arbitrary time $t = t_0$

\[
\begin{aligned}
J_2(u_2) = \tfrac{1}{2} E\big[ &\langle \Phi x_0 - T_1T_1^*D_1(\Phi\hat{x}_0 + T_2u_2) + T_2u_2,\ \Phi x_0 - T_1T_1^*D_1(\Phi\hat{x}_0 + T_2u_2) + T_2u_2 \rangle \\
&+ \langle T_1T_1^*D_1(\Phi\hat{x}_0 + T_2u_2),\ D_1(\Phi\hat{x}_0 + T_2u_2) \rangle - \langle u_2, u_2 \rangle \,\big|\, Z_2(t) \big]
\end{aligned}
\tag{7.68}
\]




(7.75) 


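with the corresponding error covariance matrix

\[
\begin{aligned}
\begin{bmatrix} \dot{P}_{11} & \dot{P}_{21} \\ \dot{P}_{21} & \dot{P}_{22} \end{bmatrix}
&= \begin{bmatrix} F & -G_1T_1^*D_1\Phi \\ P H_1^T W_1^{-1} H_1 & F - G_1T_1^*D_1\Phi - P H_1^T W_1^{-1} H_1 - P H_2^T W_2^{-1} H_2 \end{bmatrix}
\begin{bmatrix} P_{11} & P_{21} \\ P_{21} & P_{22} \end{bmatrix} \\
&\quad + \begin{bmatrix} P_{11} & P_{21} \\ P_{21} & P_{22} \end{bmatrix}
\begin{bmatrix} F & -G_1T_1^*D_1\Phi \\ P H_1^T W_1^{-1} H_1 & F - G_1T_1^*D_1\Phi - P H_1^T W_1^{-1} H_1 - P H_2^T W_2^{-1} H_2 \end{bmatrix}^T \\
&\quad + \begin{bmatrix} 0 & 0 \\ 0 & P H_1^T W_1^{-1} H_1 P \end{bmatrix}
- \begin{bmatrix} P_{11} & P_{21} \\ P_{21} & P_{22} \end{bmatrix}
\begin{bmatrix} H_2^T W_2^{-1} H_2 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} P_{11} & P_{21} \\ P_{21} & P_{22} \end{bmatrix}
\end{aligned}
\tag{7.76}
\]

with

\[
\begin{bmatrix} P_{11}(t_0) & P_{21}(t_0) \\ P_{21}(t_0) & P_{22}(t_0) \end{bmatrix}
= \begin{bmatrix} P_0 & 0 \\ 0 & 0 \end{bmatrix}
\tag{7.77}
\]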


The optimal control for player 2 is thus

\[
u_2^* = T_2^* D\, \Phi\hat{x}_2
\tag{7.78}
\]


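Using Equation (7.84) we obtain from Equation (7.76)

\[
\begin{aligned}
\dot{P}_{21} = {}& F P_{21} - G_1T_1^*D_1\Phi P_{22} + P_{11} H_1^T W_1^{-1} H_1 P + P_{21} F^T \\
&- P_{21} \big[ G_1T_1^*D_1\Phi \big]^T - P_{21} H_1^T W_1^{-1} H_1 P - P_{21} H_2^T W_2^{-1} H_2 P \\
&- P H_2^T W_2^{-1} H_2 P_{21} - P_{21} H_2^T W_2^{-1} H_2 P_{21}; \quad P_{21}(t_0) = 0
\end{aligned}
\tag{7.85}
\]

\[
\begin{aligned}
\dot{P}_{22} = {}& P H_1^T W_1^{-1} H_1 P_{21} + F P_{22} - G_1T_1^*D_1\Phi P_{22} - P H_1^T W_1^{-1} H_1 P_{22} \\
&- P H_2^T W_2^{-1} H_2 P_{22} + P_{21} H_1^T W_1^{-1} H_1 P + P_{22} F^T \\
&- P_{22} \big[ G_1T_1^*D_1\Phi \big]^T - P_{22} H_1^T W_1^{-1} H_1 P - P_{22} H_2^T W_2^{-1} H_2 P \\
&+ P H_1^T W_1^{-1} H_1 P - P_{21} H_2^T W_2^{-1} H_2 P_{21}; \quad P_{22}(t_0) = 0
\end{aligned}
\tag{7.86}
\]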


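which simplifies to

\[
\begin{aligned}
\dot{P}_{22} = {}& F P_{22} - G_1T_1^*D_1\Phi P_{22} + P_{11} H_1^T W_1^{-1} H_1 P + P_{22} F^T \\
&- P_{22} \big[ G_1T_1^*D_1\Phi \big]^T - P_{22} H_1^T W_1^{-1} H_1 P - P_{22} H_2^T W_2^{-1} H_2 P \\
&+ P H_1^T W_1^{-1} H_1 P_{21} - P_{21} H_2^T W_2^{-1} H_2 P_{21}; \quad P_{22}(t_0) = 0
\end{aligned}
\tag{7.87}
\]

Then comparing Equation (7.85) with Equation (7.87) we observe that

\[
P_{21}(t) = P_{22}(t)
\tag{7.88}
\]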


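and we can write for Equations (7.80) and (7.76)

\[
\dot{\hat{x}}_2 = \big( F - G_1T_1^*D_1\Phi \big) \hat{x}_2 + \big( G_2 - G_1T_1^*D_1T_2 \big) u_2
+ P_{11} H_2^T W_2^{-1} \big[ z_2 - H_2\hat{x}_2 \big]; \quad \hat{x}_2(t_0) = \hat{x}_0
\tag{7.89}
\]

and

\[
\begin{aligned}
\dot{P}_{11} = {}& F P_{11} + P_{11} F^T - G_1T_1^*D_1\Phi P_{21} - P_{21} \big( G_1T_1^*D_1\Phi \big)^T \\
&- P_{11} H_2^T W_2^{-1} H_2 P_{11}; \quad P_{11}(t_0) = P_0
\end{aligned}
\tag{7.90}
\]

\[
\begin{aligned}
\dot{P}_{21} = {}& F P_{21} + P_{21} F^T - G_1T_1^*D_1\Phi P_{21} - P_{21} \big( G_1T_1^*D_1\Phi \big)^T \\
&+ P_{11} H_1^T W_1^{-1} H_1 P_{11} - P_{11} H_1^T W_1^{-1} H_1 P_{21} - P_{21} H_1^T W_1^{-1} H_1 P_{11} \\
&+ P_{21} H_1^T W_1^{-1} H_1 P_{21} - P_{11} H_2^T W_2^{-1} H_2 P_{21} - P_{21} H_2^T W_2^{-1} H_2 P_{11} \\
&+ P_{21} H_2^T W_2^{-1} H_2 P_{21}; \quad P_{21}(t_0) = 0
\end{aligned}
\tag{7.91}
\]

If we define

\[
R_1(t) \triangleq \Phi^T(T, t)\,D_1(t)\,R_{12}(t)\,D(t)\,\Phi(T, t)
\tag{7.92}
\]

where

\[
R_{12}(t) \triangleq T_2T_2^*
\tag{7.93}
\]

then using

\[
\begin{aligned}
\dot{\Phi}^T(T, t) &= -F^T(t)\,\Phi^T(T, t) \\
\dot{\Phi}(T, t) &= -\Phi(T, t)\,F(t)
\end{aligned}
\]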


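we obtain after taking the derivative of $R_1(t)$ with respect to $t$

\[
\begin{aligned}
\dot{R}_1 = {}& -F^T\Phi^TD_1R_{12}D\Phi + \Phi^TD_1\Phi G_1G_1^T\Phi^TD_1R_{12}D\Phi \\
&- \Phi^TD_1\Phi G_2G_2^T\Phi^TD\Phi + \Phi^TD_1R_{12}D\Phi G_1G_1^T\Phi^TD\Phi \\
&- \Phi^TD_1R_{12}D\Phi G_2G_2^T\Phi^TD\Phi - \Phi^TD_1R_{12}D\Phi F
\end{aligned}
\tag{7.94}
\]

Substituting Equation (7.92) and the defining equations for $S(t)$ and
$N_1(t)$, i.e.,

\[
\begin{aligned}
S(t) &\triangleq \Phi^T(T, t)\,D(t)\,\Phi(T, t) \\
N_1(t) &\triangleq \Phi^T(T, t)\,D_1(t)\,\Phi(T, t)
\end{aligned}
\tag{7.95}
\]

we obtain

\[
\begin{aligned}
\dot{R}_1 = {}& -R_1F(t) - F^T(t)R_1 + N_1G_1(t)G_1^T(t)R_1 - N_1G_2(t)G_2^T(t)S \\
&+ R_1 \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad R_1(T) = 0
\end{aligned}
\tag{7.96}
\]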

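Using Equations (7.95) and (7.96) in Equations (7.78), (7.86),
(7.87) and (7.89) the optimal delayed commitment strategy for player 2
is then given by the following set of equations.

\[
u_2^*(t) = G_2^T(t)S(t)\hat{x}_2(t)
\tag{7.97}
\]

\[
\dot{S} = -SF(t) - F^T(t)S + S \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad S(T) = I
\tag{7.98}
\]

\[
\begin{aligned}
\dot{\hat{x}}_2(t) = {}& \big[ F(t) - G_1(t)G_1^T(t)N_1(t) + G_2(t)G_2^T(t)S(t) - G_1(t)G_1^T(t)R_1(t) \big] \hat{x}_2(t) \\
&+ P_{11}H_2^T(t)W_2^{-1}(t) \big[ z_2(t) - H_2(t)\hat{x}_2(t) \big]; \quad \hat{x}_2(t_0) = \hat{x}_0
\end{aligned}
\tag{7.99}
\]

\[
\begin{aligned}
\dot{P}_{11} = {}& F(t)P_{11} + P_{11}F^T(t) - G_1(t)G_1^T(t)N_1(t)P_{21} - P_{21}N_1(t)G_1(t)G_1^T(t) \\
&- P_{11}H_2^T(t)W_2^{-1}(t)H_2(t)P_{11}; \quad P_{11}(t_0) = P_0
\end{aligned}
\tag{7.100}
\]

\[
\begin{aligned}
\dot{P}_{21} = {}& F(t)P_{21} + P_{21}F^T(t) - G_1(t)G_1^T(t)N_1(t)P_{21} - P_{21}N_1(t)G_1(t)G_1^T(t) \\
&+ P_{11}H_1^T(t)W_1^{-1}(t)H_1(t)P_{11} - P_{11}H_1^T(t)W_1^{-1}(t)H_1(t)P_{21} \\
&- P_{21}H_1^T(t)W_1^{-1}(t)H_1(t)P_{11} + P_{21}H_1^T(t)W_1^{-1}(t)H_1(t)P_{21} \\
&- P_{11}H_2^T(t)W_2^{-1}(t)H_2(t)P_{21} - P_{21}H_2^T(t)W_2^{-1}(t)H_2(t)P_{11} \\
&+ P_{21}H_2^T(t)W_2^{-1}(t)H_2(t)P_{21}; \quad P_{21}(t_0) = 0
\end{aligned}
\tag{7.101}
\]

\[
\dot{N}_1 = -N_1F(t) - F^T(t)N_1 + N_1G_1(t)G_1^T(t)N_1; \quad N_1(T) = I
\tag{7.102}
\]

\[
\begin{aligned}
\dot{R}_1 = {}& -R_1F(t) - F^T(t)R_1 + N_1G_1(t)G_1^T(t)R_1 - N_1G_2(t)G_2^T(t)S \\
&+ R_1 \big[ G_1(t)G_1^T(t) - G_2(t)G_2^T(t) \big] S; \quad R_1(T) = 0
\end{aligned}
\tag{7.103}
\]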

The above solutions for player 2 are very similar to those
obtained for player 1 and are "simple" in that they can be directly
solved using forward and backward integration with a digital computer.

Recalling that Willman [8] showed that for the class of games
discussed in this chapter, the strategies could only be realized with
infinite dimensional dynamic systems, we observe that the point of
view of delayed commitment strategies leads to solutions which are
readily computable.


CHAPTER  8 


SUMMARY,  CONCLUSIONS,  AND 
SUGGESTIONS  FOR  FUTURE  WORK 

In this dissertation the problem of prior and delayed commitment
strategies to differential games with noise corrupted state measurements
is discussed. It is pointed out that the prior commitment
solution, which has led previous researchers to define the closure
problem, is valid only under restricted circumstances.

The delayed commitment solutions are then obtained for a differential
game where one player has perfect state information and the
other player has only noise corrupted measurements of the state, and are
extended in Chapter 7 to a differential game where both players have
noise corrupted measurements. In both cases, the resulting secure
strategies do again satisfy the familiar Separation Theorem of
stochastic optimal control.

Of particular significance is the fact that the governing
equations do not result in an often difficult to solve non-linear
two-point boundary value problem, but are readily computable with a
digital computer.

A detailed example of a pursuit-evasion game is presented in
Chapter 6. It discusses a missile and an airplane system where the
missile (or player 1) has perfect state measurements and the airplane
(or player 2) has noise corrupted measurements. Both the prior
commitment and delayed commitment solutions have been obtained and the
results compared.

An immediate and direct extension of the research presented in
this dissertation is to extend the results to differential games where,
in addition to noise corrupting the measurements, additive white
Gaussian noise, independent of the measurement noise and of the initial
estimate of the state, is present in the system dynamics. Of course,
if the noises are not white but Markov with rational spectra, they can
be modelled as outputs of a dynamic system which is driven by white
noise, and by adjoining this dynamic model to the system equations an
augmented system is obtained with white noise disturbances.

From the game theoretic point of view the realization that the
zero-sum assumption has to be abandoned during the actual stochastic
game offers several interesting analytic and conceptual concepts not
found in zero-sum differential games. We have used the minimax
solution concept; however, non-inferior (or Pareto optimal) strategies
or solution concepts involving coalitions, bargaining, etc., can be
envisioned.




REFERENCES 


1.  von Neumann, John, and O. Morgenstern. Theory of Games and
Economic Behavior. Princeton University Press, Princeton,
New Jersey, 1943.

2.  Isaacs, R. P. "Differential Games - I: Introduction," RAND
Corporation, Research Memorandum, RM-1391, November 1954.

3.  Isaacs, R. P. "Differential Games - II: The Definition and
Formulation," RAND Corporation, Research Memorandum, RM-1399,
November 1954.

4.  Isaacs, R. P. "Differential Games - III: The Basic Principles
of the Solution Process," RAND Corporation, Research
Memorandum, RM-1411, December 1954.

5.  Isaacs, R. P. "Differential Games - IV: Mainly Examples,"
RAND Corporation, Research Memorandum, RM-1468, March 1955.

6.  Isaacs, Rufus P. Differential Games. J. Wiley and Sons, Inc.,
New York, 1965.

7.  Aumann, R. J. and M. Maschler. "Some Thoughts on the Minimax
Principle," Management Science, Vol. 18, No. 5, Part 2,
pp. 54-63, January 1972.

8.  Willman, W. W. "Formal Solutions for a Class of Stochastic
Pursuit-Evasion Games," IEEE Trans. on Automatic Control,
Vol. AC-14, No. 5, pp. 504-509, October 1969.

9.  Behn, R. D., and Y. C. Ho. "On a Class of Linear Stochastic
Differential Games," IEEE Trans. on Automatic Control,
Vol. AC-13, No. 3, pp. 227-240, June 1968.

10. Rhodes, I. B., and D. G. Luenberger. "Differential Games
with Imperfect State Information," IEEE Trans. on Automatic
Control, Vol. AC-14, No. 1, pp. 29-38, February 1969.

11. Kuhn, H. W. "Extensive Games," Proc. Nat. Acad. Sci., Vol. 36,
pp. 570-576, October 1950.

12. Luce, Robert D. and H. Raiffa. Games and Decisions. J. Wiley
and Sons, Inc., New York, 1957.

13. Starr, A. W. "Nonzero-Sum Differential Games: Concepts and
Models," Harvard University, Division of Engineering and
Applied Physics, Technical Report No. 590, June 1969.

14. Ho, Y. C., A. E. Bryson, and S. Baron. "Differential Games
and Optimal Pursuit-Evasion Strategies," IEEE Trans. on
Automatic Control, Vol. AC-10, pp. 385-389, October 1965.

15. Berkovitz, L. D. "Variational Approach to Differential Games,"
Advances in Game Theory (Annals of Mathematics Studies, 52),
Princeton University Press, Princeton, New Jersey, 1964,
pp. 127-174.

16. Porter, W. A. "On Function Space Pursuit-Evasion Games,"
SIAM J. Control, Vol. 5, No. 4, pp. 555-574, April 1967.

17. Rhodes, I. B., and D. G. Luenberger. "Stochastic Differential
Games with Constrained State Estimators," IEEE Trans. on
Automatic Control, Vol. AC-14, No. 5, pp. 476-481, October 1969.

18. Harsanyi, J. C. "Games with Incomplete Information Played by
Bayesian Players - Part I, The Basic Model," Management Science,
Vol. 14, No. 3, pp. 159-182, November 1967.

19. Harsanyi, J. C. "Games with Incomplete Information Played by
Bayesian Players - Part II, Bayesian Equilibrium Points,"
Management Science, Vol. 14, No. 5, pp. 320-334, January 1968.

20. Harsanyi, J. C. "Games with Incomplete Information Played by
Bayesian Players - Part III, The Basic Probability Distribution
of the Game," Management Science, Vol. 14, No. 7, pp. 486-502,

21. Ho, Y. C. "On the Minimax Principle and Zero Sum Stochastic
Differential Games," Proceedings of the 1972 IEEE Conf. on
Decision and Control and 11th Symposium on Adaptive Processes,
New Orleans, Louisiana, December 1972, IEEE, New York, 1972,
pp. 333-339.

22. Peterson, Edwin L. Statistical Analysis and Optimization of
Systems. J. Wiley and Sons, Inc., New York, 1961.

23. Bryson, Jr., Arthur E. and Y. C. Ho. Applied Optimal Control.
Ginn and Co., Waltham, Massachusetts, 1969.

24. Kushner, Harold J. Introduction to Stochastic Control. Holt,
Rinehart and Winston, Inc., New York, 1971.



APPENDIX  A 


COMPUTER PROGRAM LISTING FOR THE NUMERICAL EXAMPLE OF SECTION 6.4


C     QUASILINEARIZATION ITERATION 1
      SUBROUTINE FUNEV
      COMMON TIME,DELT,NSTART,NFIRST,NEXIT,IPASS,ROMCON(2094)
      REAL K1,K2,KT1,KT2,N,ND
      REAL N0,NP1D,NP1,NH1D,NH1
      DATA TF,AS,KT1,KT2,TAU1,TAU2,R11S,R22S/10.,.04,32.2,32.2,1.,2.,.00
     11,.004/
      IF(NSTART)30,50,10
   10 READ(5,20)NP1,PP1,W2,NH1,PH1
   20 FORMAT(4E20.0)
      CALL INTG(NP1D,NP1)
      CALL INTG(PP1D,PP1)
      CALL INTG(NH1D,NH1)
      CALL INTG(PH1D,PH1)
      CALL PRINT(10H   S(T)   ,10H,G12.4    ,S,1,0.)
      CALL PRINT(10H   NP1D   ,10H,G12.4    ,NP1D,3,0.)
      CALL PRINT(10H   NP1    ,10H,G12.4    ,NP1,1,0.)
      CALL PRINT(10H   PP1D   ,10H,G12.4    ,PP1D,3,0.)
      CALL PRINT(10H   PP1    ,10H,G12.4    ,PP1,1,0.)
      CALL PRINT(10H   NH1D   ,10H,G12.4    ,NH1D,3,0.)
      CALL PRINT(10H   NH1    ,10H,G12.4    ,NH1,1,0.)
      CALL PRINT(10H   PH1D   ,10H,G12.4    ,PH1D,3,0.)
      CALL PRINT(10H   PH1    ,10H,G12.4    ,PH1,1,0.)
   30 CALL PRINT(10H   K1(T)  ,10H,G12.4    ,K1,5,0.)
      CALL PRINT(10H   K2(T)  ,10H,G12.4    ,K2,5,0.)
      RETURN
   50 TGO=TF-TIME
      T1=1.-EXP(-TGO/TAU1)-TGO/TAU1
      T2=1.-EXP(-TGO/TAU2)-TGO/TAU2
      S=6.*R11S*R22S/(6.*R11S*R22S+AS*KT1**2*R22S*(6.*TAU1**2*TGO-6.*TAU
     11*TGO**2+2.*TGO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*T
     2GO*EXP(-TGO/TAU1))-AS*KT2**2*R11S*(6.*TAU2**2*TGO-6.*TAU2*TGO**2+2
     3.*TGO**3+3.*TAU2**3*(1.-EXP(-2.*TGO/TAU2))-12.*TAU2**2*TGO*EXP(-TG
     4O/TAU2)))
      K1=AS*KT1**2*TAU1**2*T1**2/R11S
      K2=AS*KT2**2*TAU2**2*T2**2/R22S
      P0=1000.*EXP(-.5*TIME)
      N0=-.000000015
      NP1D=2.*(K1*N0+K1*S+P0/W2)*NP1+2.*N0*PP1/W2-K1*N0*N0-2.*N0*P0/W2+K
     12*S*S
      PP1D=-2.*K1*P0*NP1-2.*(K1*N0+K1*S+P0/W2)*PP1+P0*P0/W2+2.*K1*N0*P0
      NH1D=2.*(K1*N0+K1*S+P0/W2)*NH1+2.*N0*PH1/W2
      PH1D=-2.*K1*P0*NH1-2.*(K1*N0+K1*S+P0/W2)*PH1
      IF(NFIRST)60,80,60
   60 A1=-NP1/NH1
      WRITE(6,70)NP1,PP1,W2,A1
   70 FORMAT(1H0,4X,4HN = ,G14.6,5X,4HP = ,G14.6,5X,5HW2 = ,G14.6,
     15HA1 = ,G22.14)
   80 RETURN
      END
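The closed-form expression for S in the listing above can be cross-checked against a direct numerical quadrature. The Python sketch below rests on two assumptions that are not stated in the report: that the arithmetic operators of the scanned listing read as reconstructed here, and that S equals the reciprocal of one plus the integral of K1 - K2 over the time-to-go, which is what the closed form reduces to under that reading. The constants are those of the DATA statement; the function names are introduced for illustration.

```python
import math

# Constants from the DATA statement of the listing.
AS, KT1, KT2 = 0.04, 32.2, 32.2
TAU1, TAU2 = 1.0, 2.0
R11S, R22S = 0.001, 0.004


def bracket(tau, tgo):
    """Six times the integral of (tau * T(s))**2 over s in [0, tgo]."""
    return (6.0 * tau ** 2 * tgo - 6.0 * tau * tgo ** 2 + 2.0 * tgo ** 3
            + 3.0 * tau ** 3 * (1.0 - math.exp(-2.0 * tgo / tau))
            - 12.0 * tau ** 2 * tgo * math.exp(-tgo / tau))


def s_closed_form(tgo):
    """S as written in the listing (operators as reconstructed)."""
    denominator = (6.0 * R11S * R22S
                   + AS * KT1 ** 2 * R22S * bracket(TAU1, tgo)
                   - AS * KT2 ** 2 * R11S * bracket(TAU2, tgo))
    return 6.0 * R11S * R22S / denominator


def k_gain(kt, tau, r, tgo):
    """K1 or K2 of the listing as a function of time-to-go."""
    t_fun = 1.0 - math.exp(-tgo / tau) - tgo / tau
    return AS * kt ** 2 * tau ** 2 * t_fun ** 2 / r


def s_by_quadrature(tgo, panels=2000):
    """1 / (1 + integral of K1 - K2), composite Simpson rule (panels even)."""
    h = tgo / panels
    total = 0.0
    for i in range(panels + 1):
        s = i * h
        weight = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        total += weight * (k_gain(KT1, TAU1, R11S, s) - k_gain(KT2, TAU2, R22S, s))
    return 1.0 / (1.0 + total * h / 3.0)


if __name__ == "__main__":
    for tgo in (1.0, 5.0, 10.0):
        print(tgo, s_closed_form(tgo), s_by_quadrature(tgo))
```

Agreement between the two functions supports the sign pattern used in the reconstructed S statement; it does not by itself confirm the signs in the derivative statements that follow it.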


C     ITERATION 2
      SUBROUTINE FUNEV
      COMMON TIME,DELT,NSTART,NFIRST,NEXIT,IPASS,ROMCON(2094)
      REAL K1,K2,KT1,KT2,N,ND
      REAL N0,NP1,NP1D,NH1,NH1D,NP2,NP2D,NH2,NH2D
      DATA TF,AS,KT1,KT2,TAU1,TAU2,R11S,R22S/10.,.04,32.2,32.2,1.,2.,.00
     11,.004/
      IF(NSTART)30,50,10
   10 READ(5,20)NP1,PP1,W2,NP2,PP2,NH2,PH2
   20 FORMAT(4E20.0)
      CALL INTG(NP1D,NP1)
      CALL INTG(PP1D,PP1)
      CALL INTG(NP2D,NP2)
      CALL INTG(PP2D,PP2)
      CALL INTG(NH2D,NH2)
      CALL INTG(PH2D,PH2)
      CALL PRINT(10H   S(T)   ,10H,G12.4    ,S,1,0.)
      CALL PRINT(10H   NP1D   ,10H,G12.4    ,NP1D,3,0.)
      CALL PRINT(10H   NP1    ,10H,G12.4    ,NP1,1,0.)
      CALL PRINT(10H   PP1D   ,10H,G12.4    ,PP1D,3,0.)
      CALL PRINT(10H   PP1    ,10H,G12.4    ,PP1,1,0.)
      CALL PRINT(10H   NP2D   ,10H,G12.4    ,NP2D,3,0.)
      CALL PRINT(10H   NP2    ,10H,G12.4    ,NP2,1,0.)
      CALL PRINT(10H   PP2D   ,10H,G12.4    ,PP2D,3,0.)
      CALL PRINT(10H   PP2    ,10H,G12.4    ,PP2,1,0.)
      CALL PRINT(10H   NH2D   ,10H,G12.4    ,NH2D,3,0.)
      CALL PRINT(10H   NH2    ,10H,G12.4    ,NH2,1,0.)
      CALL PRINT(10H   PH2D   ,10H,G12.4    ,PH2D,3,0.)
      CALL PRINT(10H   PH2    ,10H,G12.4    ,PH2,1,0.)
   30 CALL PRINT(10H   K1(T)  ,10H,G12.4    ,K1,5,0.)
      CALL PRINT(10H   K2(T)  ,10H,G12.4    ,K2,5,0.)
      RETURN
   50 TGO=TF-TIME
      T1=1.-EXP(-TGO/TAU1)-TGO/TAU1
      T2=1.-EXP(-TGO/TAU2)-TGO/TAU2
      S=6.*R11S*R22S/(6.*R11S*R22S+AS*KT1**2*R22S*(6.*TAU1**2*TGO-6.*TAU
     11*TGO**2+2.*TGO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*T
     2GO*EXP(-TGO/TAU1))-AS*KT2**2*R11S*(6.*TAU2**2*TGO-6.*TAU2*TGO**2+2
     3.*TGO**3+3.*TAU2**3*(1.-EXP(-2.*TGO/TAU2))-12.*TAU2**2*TGO*EXP(-TG
     4O/TAU2)))
      K1=AS*KT1**2*TAU1**2*T1**2/R11S
      K2=AS*KT2**2*TAU2**2*T2**2/R22S
      N0=-.000000015
      P0=1000.*EXP(-.5*TIME)
      NP1D=2.*(K1*N0+K1*S+P0/W2)*NP1+2.*N0*PP1/W2-K1*N0*N0-2.*N0*P0/W2+K
     12*S*S
      PP1D=-2.*K1*P0*NP1-2.*(K1*N0+K1*S+P0/W2)*PP1+P0*P0/W2+2.*K1*N0*P0
      NP2D=2.*(K1*NP1+K1*S+PP1/W2)*NP2+2.*NP1*PP2/W2-K1*NP1*NP1-2.*NP1*P
     1P1/W2+K2*S*S
      PP2D=-2.*K1*PP1*NP2-2.*(K1*NP1+K1*S+PP1/W2)*PP2+PP1*PP1/W2+2.*K1*N
     1P1*PP1
      NH2D=2.*(K1*NP1+K1*S+PP1/W2)*NH2+2.*NP1*PH2/W2
      PH2D=-2.*K1*PP1*NH2-2.*(K1*NP1+K1*S+PP1/W2)*PH2
      IF(NFIRST)60,80,60
   60 A2=-NP2/NH2
      WRITE(6,70)NP1,PP1,W2,A2
   70 FORMAT(1H0,4X,4HN = ,G14.6,5X,4HP = ,G14.6,5X,5HW2 = ,G14.6,
     15HA2 = ,G22.14)
   80 RETURN
      END


-  ITERATION  3 
SUBROUTINE  FuNEV 

COMMON  TIME* DECT tNSTART *NFIRST*NEX IT* 1 PASS »ROMCON (2094) 

REAL  AltK2*KTl*KT2*N*ND 

REAL  NO *NPl *NP10*NM1 *NH1D*NP2«  NP20*  NH2,NH20 
REAL  NP3*NH3*NP30*NM30 

OATA  TF , AS«KT1 «KT2*  TAU1 *  TAU2*R1 1S.R22S/1 0*  **04*32*2*32*2«1**2*«*00 
11*. 004/ 

IF  (NSTART) 30*50*10 

10  REAO(S*20)NP1*PP1*W2*NP2«PP2«NP3*PP3*NH3*PH3 
20  FORMAT (4E20*0) 


CALL 

I NTG ( NP 1 0  «  NP 1 1 

call 

1NTG(PP10*PP1) 

CALL 

INTG |NP?D*NP2) 

CALL 

INTG(PP20*PP2I 

CALL 

INTGCNP3D*NP3) 

CALL 

INTG (PP30*PP3) 

CALL 

INTG(NM3D*NH3) 

CALL 

INTG (PM30*PH3J 

      CALL PRINT(10H  S(T)    ,10H,G12.4    ,S,1,0.)
      CALL PRINT(10H  NP1D    ,10H,G12.4    ,NP1D,3,0.)
      CALL PRINT(10H  NP1     ,10H,G12.4    ,NP1,1,0.)
      CALL PRINT(10H  PP1D    ,10H,G12.4    ,PP1D,3,0.)
      CALL PRINT(10H  PP1     ,10H,G12.4    ,PP1,1,0.)
      CALL PRINT(10H  NP2D    ,10H,G12.4    ,NP2D,3,0.)
      CALL PRINT(10H  NP2     ,10H,G12.4    ,NP2,1,0.)
      CALL PRINT(10H  PP2D    ,10H,G12.4    ,PP2D,3,0.)
      CALL PRINT(10H  PP2     ,10H,G12.4    ,PP2,1,0.)
      CALL PRINT(10H  NP3D    ,10H,G12.4    ,NP3D,3,0.)
      CALL PRINT(10H  NP3     ,10H,G12.4    ,NP3,1,0.)
      CALL PRINT(10H  PP3D    ,10H,G12.4    ,PP3D,3,0.)
      CALL PRINT(10H  PP3     ,10H,G12.4    ,PP3,1,0.)



      CALL PRINT(10H  NH3D    ,10H,G12.4    ,NH3D,3,0.)
      CALL PRINT(10H  NH3     ,10H,G12.4    ,NH3,1,0.)
      CALL PRINT(10H  PH3D    ,10H,G12.4    ,PH3D,3,0.)
      CALL PRINT(10H  PH3     ,10H,G12.4    ,PH3,1,0.)
   30 CALL PRINT(10H  K1(T)   ,10H,G12.4    ,KT1,5,0.)
      CALL PRINT(10H  K2(T)   ,10H,G12.4    ,KT2,5,0.)
      RETURN

   50 TGO=TF-TIME
      T1=1.-EXP(-TGO/TAU1)-TGO/TAU1
      T2=1.-EXP(-TGO/TAU2)-TGO/TAU2
      S=6.*R11S*R22S/(6.*R11S*R22S+AS*KT1**2*R22S*(6.*TAU1**2*TGO-6.*TAU
     11*TGO**2+2.*TGO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*T
     2GO*EXP(-TGO/TAU1))-AS*KT2**2*R11S*(6.*TAU2**2*TGO-6.*TAU2*TGO**2+2
     3.*TGO**3+3.*TAU2**3*(1.-EXP(-2.*TGO/TAU2))-12.*TAU2**2*TGO*EXP(-TG
     4O/TAU2)))
      K1=AS*KT1**2*TAU1**2*T1**2/R11S
      K2=AS*KT2**2*TAU2**2*T2**2/R22S
      N0=-.000000015
      P0=1000.*EXP(-.5*TIME)
      NP1D=2.*(K1*N0+K1*S+P0/W2)*NP1+2.*N0*PP1/W2-K1*N0*N0-2.*N0*P0/W2+K
     12*S*S
      PP1D=-2.*K1*P0*NP1-2.*(K1*N0+K1*S+P0/W2)*PP1+P0*P0/W2+2.*K1*N0*P0
      NP2D=2.*(K1*NP1+K1*S+PP1/W2)*NP2+2.*NP1*PP2/W2-K1*NP1*NP1-2.*NP1*P
     1P1/W2+K2*S*S
      PP2D=-2.*K1*PP1*NP2-2.*(K1*NP1+K1*S+PP1/W2)*PP2+PP1*PP1/W2+2.*K1*N
     1P1*PP1
      NP3D=2.*(K1*NP2+K1*S+PP2/W2)*NP3+2.*NP2*PP3/W2-K1*NP2*NP2-2.*NP2*P
     1P2/W2+K2*S*S
      PP3D=-2.*K1*PP2*NP3-2.*(K1*NP2+K1*S+PP2/W2)*PP3+PP2*PP2/W2+2.*K1*N
     1P2*PP2
      NH3D=2.*(K1*NP2+K1*S+PP2/W2)*NH3+2.*NP2*PH3/W2
      PH3D=-2.*K1*PP2*NH3-2.*(K1*NP2+K1*S+PP2/W2)*PH3
      IF(NFIRST)60,80,60
   60 A3=-NP3/NH3
      WRITE(6,70)NP1,NP2,W2,A3
   70 FORMAT(1H0,4X,4HN = ,G14.6,5X,4HP = ,G14.6,5X,5HW2 = ,
     1G14.6,5HA3 = ,G22.14)
   80 RETURN
      END


-  ITERATION  4 
      SUBROUTINE FUNEV
      COMMON TIME,DELT,NSTART,NFIRST,NEXIT,IPASS,ROMCON(2094)
      REAL K1,K2,KT1,KT2,N,N0
      REAL NO,NP1,NP1D,NH1,NH1D,NP2,NP2D,NH2,NH2D
      REAL NP3,NH3,NP3D,NH3D
      REAL NP4,NH4,NP4D,NH4D
      DATA TF,AS,KT1,KT2,TAU1,TAU2,R11S,R22S/10.,.04,32.2,32.2,1.,2.,.00
     11,.004/
      IF(NSTART)30,50,10




   10 READ(5,20)NP1,PP1,W2,NP2,PP2,NP3,PP3,NP4,PP4,NH4,PH4
   20 FORMAT(4E20.0)
      CALL INTG(NP1D,NP1)
      CALL INTG(PP1D,PP1)
      CALL INTG(NP2D,NP2)
      CALL INTG(PP2D,PP2)
      CALL INTG(NP3D,NP3)
      CALL INTG(PP3D,PP3)
      CALL INTG(NP4D,NP4)
      CALL INTG(PP4D,PP4)
      CALL INTG(PH4D,PH4)
      CALL INTG(NH4D,NH4)

      CALL PRINT(10H  S(T)    ,10H,G12.4    ,S,1,0.)
      CALL PRINT(10H  NP1D    ,10H,G12.4    ,NP1D,3,0.)
      CALL PRINT(10H  NP1     ,10H,G12.4    ,NP1,1,0.)
      CALL PRINT(10H  PP1D    ,10H,G12.4    ,PP1D,3,0.)
      CALL PRINT(10H  PP1     ,10H,G12.4    ,PP1,1,0.)
      CALL PRINT(10H  NP2D    ,10H,G12.4    ,NP2D,3,0.)
      CALL PRINT(10H  NP2     ,10H,G12.4    ,NP2,1,0.)
      CALL PRINT(10H  PP2D    ,10H,G12.4    ,PP2D,3,0.)
      CALL PRINT(10H  PP2     ,10H,G12.4    ,PP2,1,0.)
      CALL PRINT(10H  NP3D    ,10H,G12.4    ,NP3D,3,0.)
      CALL PRINT(10H  NP3     ,10H,G12.4    ,NP3,1,0.)
      CALL PRINT(10H  PP3D    ,10H,G12.4    ,PP3D,3,0.)
      CALL PRINT(10H  PP3     ,10H,G12.4    ,PP3,1,0.)
      CALL PRINT(10H  NP4D    ,10H,G12.4    ,NP4D,3,0.)
      CALL PRINT(10H  NP4     ,10H,G12.4    ,NP4,1,0.)
      CALL PRINT(10H  PP4D    ,10H,G12.4    ,PP4D,3,0.)
      CALL PRINT(10H  PP4     ,10H,G12.4    ,PP4,1,0.)
      CALL PRINT(10H  NH4D    ,10H,G12.4    ,NH4D,3,0.)
      CALL PRINT(10H  NH4     ,10H,G12.4    ,NH4,1,0.)
      CALL PRINT(10H  PH4D    ,10H,G12.4    ,PH4D,3,0.)
      CALL PRINT(10H  PH4     ,10H,G12.4    ,PH4,1,0.)
   30 CALL PRINT(10H  K1(T)   ,10H,G12.4    ,KT1,5,0.)
      CALL PRINT(10H  K2(T)   ,10H,G12.4    ,KT2,5,0.)
      RETURN

   50 TGO=TF-TIME
      T1=1.-EXP(-TGO/TAU1)-TGO/TAU1
      T2=1.-EXP(-TGO/TAU2)-TGO/TAU2
      S=6.*R11S*R22S/(6.*R11S*R22S+AS*KT1**2*R22S*(6.*TAU1**2*TGO-6.*TAU
     11*TGO**2+2.*TGO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*T
     2GO*EXP(-TGO/TAU1))-AS*KT2**2*R11S*(6.*TAU2**2*TGO-6.*TAU2*TGO**2+2
     3.*TGO**3+3.*TAU2**3*(1.-EXP(-2.*TGO/TAU2))-12.*TAU2**2*TGO*EXP(-TG
     4O/TAU2)))
      K1=AS*KT1**2*TAU1**2*T1**2/R11S
      K2=AS*KT2**2*TAU2**2*T2**2/R22S
      N0=-.000000015
      P0=1000.*EXP(-.5*TIME)
      NP1D=2.*(K1*N0+K1*S+P0/W2)*NP1+2.*N0*PP1/W2-K1*N0*N0-2.*N0*P0/W2+K
     12*S*S
      PP1D=-2.*K1*P0*NP1-2.*(K1*N0+K1*S+P0/W2)*PP1+P0*P0/W2+2.*K1*N0*P0
      NP2D=2.*(K1*NP1+K1*S+PP1/W2)*NP2+2.*NP1*PP2/W2-K1*NP1*NP1-2.*NP1*P
     1P1/W2+K2*S*S
      PP2D=-2.*K1*PP1*NP2-2.*(K1*NP1+K1*S+PP1/W2)*PP2+PP1*PP1/W2+2.*K1*N
     1P1*PP1
      NP3D=2.*(K1*NP2+K1*S+PP2/W2)*NP3+2.*NP2*PP3/W2-K1*NP2*NP2-2.*NP2*P
     1P2/W2+K2*S*S
      PP3D=-2.*K1*PP2*NP3-2.*(K1*NP2+K1*S+PP2/W2)*PP3+PP2*PP2/W2+2.*K1*N
     1P2*PP2
      NP4D=2.*(K1*NP3+K1*S+PP3/W2)*NP4+2.*NP3*PP4/W2-K1*NP3*NP3-2.*NP3*P
     1P3/W2+K2*S*S
      PP4D=-2.*K1*PP3*NP4-2.*(K1*NP3+K1*S+PP3/W2)*PP4+PP3*PP3/W2+2.*K1*N
     1P3*PP3
      NH4D=2.*(K1*NP3+K1*S+PP3/W2)*NH4+2.*NP3*PH4/W2
      PH4D=-2.*K1*PP3*NH4-2.*(K1*NP3+K1*S+PP3/W2)*PH4
      IF(NFIRST)60,80,60
   60 A4=-NP4/NH4
      WRITE(6,70)NP1,NP2,W2,A4
   70 FORMAT(1H0,4X,4HN = ,G14.6,5X,4HP = ,G14.6,5X,5HW2 = ,
     1G14.6,5HA4 = ,G22.14)
   80 RETURN
      END


-  FINAL  SOLUTION 


      SUBROUTINE FUNEV
      COMMON TIME,DELT,NSTART,NFIRST,NEXIT,IPASS,ROMCON(2094)
      REAL K1,K2,KT1,KT2,N,N0
      REAL NO,NP1,NP1D,NH1,NH1D,NP2,NP2D,NH2,NH2D
      REAL NP3,NH3,NP3D,NH3D
      REAL NP4,NH4,NP4D,NH4D
      REAL ND2,ND21,J1,J1D,J2,J2D,JR1,JR2,JR11,JR12,JR21,JR22,NND2
      DATA TF,AS,KT1,KT2,TAU1,TAU2,R11S,R22S/10.,.04,32.2,32.2,1.,2.,.00
     11,.004/
      IF(NSTART)30,50,10
   10 READ(5,20)NP1,PP1,W2,NP2,PP2,NP3,PP3,NP4,PP4,P2
   20 FORMAT(4E20.0)
      CALL INTG(NP1D,NP1)
      CALL INTG(PP1D,PP1)
      CALL INTG(NP2D,NP2)
      CALL INTG(PP2D,PP2)
      CALL INTG(NP3D,NP3)
      CALL INTG(PP3D,PP3)
      CALL INTG(NP4D,NP4)
      CALL INTG(PP4D,PP4)
      CALL INTG(P2D,P2)
      CALL INTG(J1D,J1)
      CALL INTG(J2D,J2)

      CALL PRINT(10H  S(T)    ,10H,G12.4    ,S,1,0.)
      CALL PRINT(10H  NP1D    ,10H,G12.4    ,NP1D,3,0.)
      CALL PRINT(10H  NP1     ,10H,G12.4    ,NP1,1,0.)
      CALL PRINT(10H  PP1D    ,10H,G12.4    ,PP1D,3,0.)
      CALL PRINT(10H  PP1     ,10H,G12.4    ,PP1,1,0.)
      CALL PRINT(10H  NP2D    ,10H,G12.4    ,NP2D,3,0.)
      CALL PRINT(10H  NP2     ,10H,G12.4    ,NP2,1,0.)
      CALL PRINT(10H  PP2D    ,10H,G12.4    ,PP2D,3,0.)
      CALL PRINT(10H  PP2     ,10H,G12.4    ,PP2,1,0.)
      CALL PRINT(10H  NP3D    ,10H,G12.4    ,NP3D,3,0.)
      CALL PRINT(10H  NP3     ,10H,G12.4    ,NP3,1,0.)
      CALL PRINT(10H  PP3D    ,10H,G12.4    ,PP3D,3,0.)
      CALL PRINT(10H  PP3     ,10H,G12.4    ,PP3,1,0.)
      CALL PRINT(10H  NP4D    ,10H,G12.4    ,NP4D,3,0.)
      CALL PRINT(10H  NP4     ,10H,G12.4    ,NP4,1,0.)
      CALL PRINT(10H  PP4D    ,10H,G12.4    ,PP4D,3,0.)
      CALL PRINT(10H  PP4     ,10H,G12.4    ,PP4,1,0.)
      CALL PRINT(10H  ND2     ,10H,G12.4    ,ND2,1,0.)
      CALL PRINT(10H  ND21    ,10H,G12.4    ,ND21,1,0.)
      CALL PRINT(10H  NND2    ,10H,G12.4    ,NND2,1,0.)
      CALL PRINT(10H  G1S     ,10H,G12.4    ,G1S,1,0.)
      CALL PRINT(10H  G2S     ,10H,G12.4    ,G2S,1,0.)
      CALL PRINT(10H  G1N2    ,10H,G12.4    ,G1N2,1,0.)
      CALL PRINT(10H  G1N1    ,10H,G12.4    ,G1N1,1,0.)
      CALL PRINT(10H  G1N     ,10H,G12.4    ,G1N,1,0.)
      CALL PRINT(10H  P2D     ,10H,G12.4    ,P2D,3,0.)
      CALL PRINT(10H  P2      ,10H,G12.4    ,P2,1,0.)
      CALL PRINT(10H  J1D     ,10H,G12.4    ,J1D,3,0.)
      CALL PRINT(10H  J1      ,10H,G12.4    ,J1,1,0.)
      CALL PRINT(10H  J2D     ,10H,G12.4    ,J2D,3,0.)
C     NOTE - THE SOURCE LISTING PASSES J1, NOT J2, UNDER THE J2 LABEL
      CALL PRINT(10H  J2      ,10H,G12.4    ,J1,1,0.)
      CALL PRINT(10H  JR11    ,10H,G12.4    ,JR11,1,0.)
      CALL PRINT(10H  JR12    ,10H,G12.4    ,JR12,1,0.)
      CALL PRINT(10H  JR21    ,10H,G12.4    ,JR21,1,0.)
      CALL PRINT(10H  JR22    ,10H,G12.4    ,JR22,1,0.)
      CALL PRINT(10H  JR1     ,10H,G12.6    ,JR1,1,0.)
      CALL PRINT(10H  JR2     ,10H,G12.6    ,JR2,1,0.)
   30 CALL PRINT(10H  K1(T)   ,10H,G12.4    ,K1,5,0.)
      CALL PRINT(10H  K2(T)   ,10H,G12.4    ,K2,5,0.)

      A=SQRT(AS)
      R11=SQRT(R11S)
      R22=SQRT(R22S)
      I=1
      RETURN
   50 TGO=TF-TIME
      T1=1.-EXP(-TGO/TAU1)-TGO/TAU1
      T2=1.-EXP(-TGO/TAU2)-TGO/TAU2
      S=6.*R11S*R22S/(6.*R11S*R22S+AS*KT1**2*R22S*(6.*TAU1**2*TGO-6.*TAU
     11*TGO**2+2.*TGO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*T
     2GO*EXP(-TGO/TAU1))-AS*KT2**2*R11S*(6.*TAU2**2*TGO-6.*TAU2*TGO**2+2
     3.*TGO**3+3.*TAU2**3*(1.-EXP(-2.*TGO/TAU2))-12.*TAU2**2*TGO*EXP(-TG
     4O/TAU2)))
      K1=AS*KT1**2*TAU1**2*T1**2/R11S
      K2=AS*KT2**2*TAU2**2*T2**2/R22S
      ND2=6.*R11S/(6.*R11S+AS*KT1**2*(6.*TAU1**2*TGO-6.*TAU1*TGO**2+2.*T
     1GO**3+3.*TAU1**3*(1.-EXP(-2.*TGO/TAU1))-12.*TAU1**2*TGO*EXP(-TGO/T
     2AU1)))


